
volume 1 


the mathematical 
foundations 
of music 














Musimathics 




Musimathics 

The Mathematical Foundations of Music 
Volume 1 


Gareth Loy 


The MIT Press 
Cambridge, Massachusetts 
London, England 



© 2006 Gareth Loy 

All rights reserved. No part of this book may be reproduced in any form by any electronic or mechanical means (including 
photocopying, recording, or information storage and retrieval) without permission in writing from the publisher. 

MIT Press books may be purchased at special quantity discounts for business or sales promotional use. For information, 
please e-mail <special_sales@mitpress.mit.edu> or write to Special Sales Department, The MIT Press, 5 Cambridge 
Center, Cambridge, MA 02142. 

This book was set in Times Roman by Interactive Composition Corporation.

Printed and bound in the United States of America. 

Library of Congress Cataloging-in-Publication Data 

Loy, D. Gareth. 

Musimathics : a guided tour of the mathematics of music / Gareth Loy. 

Includes bibliographical references and indexes.

Contents: v. 1. Musical elements 

ISBN 0-262-12282-0—ISBN 978-0-262-12282-5 (v. 1 : alk. paper) 

1. Music in mathematics education. 2. Mathematics—Study and teaching. 3. Music theory—Mathematics. 

4. Music—Acoustics and physics. I. Title. 

QA19.M87L69 2006 
781.2—dc22 

2005051090 


10 9 8 7 6 5 4 3 2 1






Contents 


Foreword by Max Mathews

Preface

About the Author

Acknowledgments

1 Music and Sound

1.1 Basic Properties of Sound

1.2 Waves

1.3 Summary

2 Representing Music

2.1 Notation

2.2 Tones, Notes, and Scores

2.3 Pitch

2.4 Scales

2.5 Interval Sonorities

2.6 Onset and Duration

2.7 Musical Loudness

2.8 Timbre

2.9 Summary

3 Musical Scales, Tuning, and Intonation

3.1 Equal-Tempered Intervals

3.2 Equal-Tempered Scale

3.3 Just Intervals and Scales

3.4 The Cent Scale

3.5 A Taxonomy of Scales

3.6 Do Scales Come from Timbre or Proportion?





3.7 Harmonic Proportion

3.8 Pythagorean Diatonic Scale

3.9 The Problem of Transposing Just Scales

3.10 Consonance of Intervals

3.11 The Powers of the Fifth and the Octave Do Not Form a Closed System

3.12 Designing Useful Scales Requires Compromise

3.13 Tempered Tuning Systems

3.14 Microtonality

3.15 Rule of 18

3.16 Deconstructing Tonal Harmony

3.17 Deconstructing the Octave

3.18 The Prospects for Alternative Tunings

3.19 Summary

3.20 Suggested Reading

4 Physical Basis of Sound

4.1 Distance

4.2 Dimension

4.3 Time

4.4 Mass

4.5 Density

4.6 Displacement

4.7 Speed

4.8 Velocity

4.9 Instantaneous Velocity

4.10 Acceleration

4.11 Relating Displacement, Velocity, Acceleration, and Time

4.12 Newton’s Laws of Motion

4.13 Types of Force

4.14 Work and Energy

4.15 Internal and External Forces

4.16 The Work-Energy Theorem

4.17 Conservative and Nonconservative Forces

4.18 Power

4.19 Power of Vibrating Systems

4.20 Wave Propagation

4.21 Amplitude and Pressure

4.22 Intensity





4.23 Inverse Square Law

4.24 Measuring Sound Intensity

4.25 Summary

5 Geometrical Basis of Sound

5.1 Circular Motion and Simple Harmonic Motion

5.2 Rotational Motion

5.3 Projection of Circular Motion

5.4 Constructing a Sinusoid

5.5 Energy of Waveforms

5.6 Summary

6 Psychophysical Basis of Sound

6.1 Signaling Systems

6.2 The Ear

6.3 Psychoacoustics and Psychophysics

6.4 Pitch

6.5 Loudness

6.6 Frequency Domain Masking

6.7 Beats

6.8 Combination Tones

6.9 Critical Bands

6.10 Duration

6.11 Consonance and Dissonance

6.12 Localization

6.13 Externalization

6.14 Timbre

6.15 Summary

6.16 Suggested Reading

7 Introduction to Acoustics

7.1 Sound and Signal

7.2 A Simple Transmission Model

7.3 How Vibrations Travel in Air

7.4 Speed of Sound

7.5 Pressure Waves

7.6 Sound Radiation Models

7.7 Superposition and Interference

7.8 Reflection





7.9 Refraction

7.10 Absorption

7.11 Diffraction

7.12 Doppler Effect

7.13 Room Acoustics

7.14 Summary

7.15 Suggested Reading

8 Vibrating Systems

8.1 Simple Harmonic Motion Revisited

8.2 Frequency of Vibrating Systems

8.3 Some Simple Vibrating Systems

8.4 The Harmonic Oscillator

8.5 Modes of Vibration

8.6 A Taxonomy of Vibrating Systems

8.7 One-Dimensional Vibrating Systems

8.8 Two-Dimensional Vibrating Elements

8.9 Resonance (Continued)

8.10 Transiently Driven Vibrating Systems

8.11 Summary

8.12 Suggested Reading

9 Composition and Methodology

9.1 Guido’s Method

9.2 Methodology and Composition

9.3 Musimat: A Simple Programming Language for Music

9.4 Program for Guido’s Method

9.5 Other Music Representation Systems

9.6 Delegating Choice

9.7 Randomness

9.8 Chaos and Determinism

9.9 Combinatorics

9.10 Atonality

9.11 Composing Functions

9.12 Traversing and Manipulating Musical Materials

9.13 Stochastic Techniques

9.14 Probability

9.15 Information Theory and the Mathematics of Expectation





9.16 Music, Information, and Expectation

9.17 Form in Unpredictability

9.18 Monte Carlo Methods

9.19 Markov Chains

9.20 Causality and Composition

9.21 Learning

9.22 Music and Connectionism

9.23 Representing Musical Knowledge

9.24 Next-Generation Musikalische Würfelspiel

9.25 Calculating Beauty

Appendix A

A.1 Exponents

A.2 Logarithms

A.3 Series and Summations

A.4 About Trigonometry

A.5 Xeno’s Paradox

A.6 Modulo Arithmetic and Congruence

A.7 Whence 0.161 in Sabine’s Equation?

A.8 Excerpts from Pope John XXII’s Bull Regarding Church Music

A.9 Greek Alphabet

Appendix B

B.1 Musimat

B.2 Music Datatypes in Musimat

B.3 Unicode (ASCII) Character Codes

B.4 Operator Associativity and Precedence in Musimat

Glossary

Notes

References

Equation Index

Subject Index




Foreword 


Musimathics by Gareth Loy is a guided tour-de-force of the mathematics and physics of music. It
pulls no punches in presenting the scientific fundamentals needed to really understand music, but
at the same time it is so clearly written that readers willing to spend time can learn all they need
to know to do basic research in modern technical music. Advanced placement courses in math and
science in any good high school are plenty of background; from there on Loy leads readers to
wherever they want to go.

Loy has always been a brilliantly clear writer. In Musimathics he is also an encyclopedic writer. 
He covers everything needed to understand existing music and musical instruments or to create 
new music or new instruments. 

Loy’s book and John R. Pierce’s famous The Science of Musical Sound belong on everyone’s 
bookshelf, and the rest of the shelf can be empty. 


Max Mathews 






Preface 


To start a great enterprise requires at the beginning only the first step.¹

Mathematics can be as effortless as humming a tune, if you know the tune. But our culture does 
not prepare us for appreciation of mathematics as it does for appreciation of music. Though we 
start hearing music very early in life, the same cannot be said of mathematics, even though the two 
subjects are twins. This is a shame; to know music without knowing its mathematics is like hearing 
a melody without its accompaniment. 

If you are drawn to mathematics because of your love of music, then this book is for you. It
provides a commonsense, self-contained, self-consistent, self-referential introduction to these
subjects for nonspecialist readers. It is designed for musicians who find their art increasingly
mediated by technology, and it is written for anyone who desires to understand the intersection
between art and science.

It has been my experience that there are many who want a deeper understanding of the
mathematics of music if the subject could be presented in a manner accessible to them. This book aims
to meet that need. My goal is always to sustain readers’ motivation while competence is gradually
built up in mathematical fundamentals.

Readers will need only average experience with mathematics and music: advanced high
school math or college freshman algebra and some basic music theory. No knowledge of the
calculus, apart from a small amount supplied in volume 2, is required. Some physics background is
helpful, but the text supplies almost everything necessary for understanding.

Virtually all of this book is focused on the mathematics of music: 

■ The topics are all subjects that contemporary composers, musicians, and music engineers have 
found to be important. 

■ The examples are all practical problems in music and audio. 

■ Even the fundamentals are cast in terms of the goal: I try to make it clear up front why a
foundation is relevant and what readers will be able to do with it once it is mastered.

This is not a book for the mathematically inexperienced, nor is it for experts. My aim is balance.
I travel at a somewhat leisurely pace through this very remarkable material, examining not just
its mathematical content but its aesthetic and philosophical qualities as well.





Musimathics presents the story of music engineering by examining its mathematics. Since
engineering is basically about applying human values to nature, readers will discover a lot about
themselves, about the world of sound and music, and about what human cultures have valued. However,
because I approach these values from an abstract perspective, they can be seen objectively, giving
a better vantage point from which readers can make their own choices.

There are four main directions of inquiry in volume 1:

■ The materials of music: notes, intervals, scales 

■ The physical properties of music: frequency, amplitude, duration, timbre 

■ The perception of music and sound: how we hear 

■ Music composition 

Volume 2 presents a deeper cut into the underlying mathematics of music and sound, including 

■ Digital audio, sampling, binary numbers 

■ Complex numbers and how they simplify representation of musical signals

■ Fourier transform, convolution, and filtering 

■ Resonance, the wave equation, and the behavior of acoustical systems 

■ Sound synthesis 

■ The short-time Fourier transform, phase vocoder, and the wavelet transform 

The Web site, http://www.musimathics.com/, contains additional source material, animations, 
figures, and sources for other program examples in this book. Also, try saying “Musimathics” to 
your favorite Web browser and see what happens. 

About the Author 

This section is here to give readers a sense of comfort that they are in good hands. I received my
Doctor of Musical Arts (DMA) degree from Stanford University in 1980 in composition of computer
music. I did my graduate work at the Stanford Center for Computer Research in Music and
Acoustics (CCRMA), one of the premier institutions for the study of this subject, then housed in
the Stanford Artificial Intelligence Laboratory. I have been a performing musician all my life
(violin, guitar, lute, sitar, and voice) and am an award-winning composer (Bourges prize) and a
National Endowment for the Arts grant recipient. I spent over a decade conducting research and
teaching computer music, electronic music, and musical acoustics at the University of California,
San Diego, as Director of Research at the Center for Music Experiment. More recently I’ve been
a computer programmer, software architect, and digital audio systems engineer in various companies
in Silicon Valley. I am president of a (very) small corporation, http://www.GarethInc.com/, which
provides engineering consulting services internationally.





But there’s more about me that you should know. Mathematics has never been an easy subject 
for me; I am a composer by training, not a mathematician. My academic career suffered badly in 
proportion to the amount of mathematics included in the syllabus. The aim of confessing this is 
paradoxically to give readers confidence. I know what it’s like not to comprehend mathematics 
easily, and I also know what it’s like not to give up. 

Notwithstanding my inability to add a column of figures and come up with the same answer 
twice, I found that mathematics was the lion in my path, the invariant obstacle to the realization 
of my artistic visions. So it was more out of necessity than facility that I came to study
mathematics. The composer Harry Partch constructed an entire orchestra of novel instruments to realize
his artistic vision and once called himself “a composer seduced into carpentry.” By analogy, I
suppose I’m a composer seduced into mathematics. 

I considered subtitling this book, “Everything I wanted to know about music when I was
eleven.” At that age I prowled the stacks of a nearby university library in search of answers to my
burning questions, only to discover that they were out of reach because I didn’t understand the
jargon in which the answers were written. At that age we are still intellectually fearless. In my
experience as a child and as a father and teacher, I’ve come to believe that there is nothing an
eleven-year-old can’t understand given the right explanation. But by the time most of us have
reached adulthood, this inquisitive quality is in eclipse, in large part because the right explanations
are very hard to come by. This book is my gift to myself all those years ago, of all the best
explanations I’ve been able to find or invent for many of the questions I had. And this book is my gift
to you; may it help throw open the doors to the mathematics of music, one of the crown jewels
of our civilization.

C. G. Jung (1962) wrote, “The decisive question for man is: Is he related to something infinite 
or not? In the final analysis, we count for something only because of the essential we embody, and 
if we do not embody that, life is wasted.” 

In the storm called life, mathematics and music are two sure guides to that essential that we all 
embody. 

Acknowledgments 

This work was supported in part by a generous grant of love and encouragement from my wife, 
Lisa, and my children, Morgan, Greta, and Tutti. 

Thanks to all those whose passion for the subject has helped inflame my own, including my 
teachers Herbert Bielawa, John Chowning, Andy Moorer, John Grey, Loren Rush, Leonard Ratner, 
and Leland Smith. Thanks to those who have helped keep the dream in focus: Connie Strohbehn, 
Shari Carlson, Linda Grahm. 

I am grateful to all whose scholarship and research have fed into the rich stream of knowledge
that this book can at best sample and summarize. The enormous list of these individuals begins
with the bibliography of this book and extends recursively through all the influences they cite. If
there is anything to praise in this work, it is because it reflects the wisdom of these antecedents;
if there is fault, it is mine alone.

Thanks to those courageous individuals who reviewed chapters of this book prior to
publication: Charles Seagrave, Stan Green, Dana Massie, Mark Kahrs, Richard Kavinoky, Malcolm
Slaney, John Strawn, Dan Freed, Herbert Bielawa, Stephen Pope, Roy Harvey, Julius Smith, Ted 
Marsh, Mark Dolson, Andy Moorer, Robert Owen. 

Thanks also to the mockingbird outside my window whose song at this late hour reminds me 
of the universality of music. 

Gareth Loy 

Corte Madera, California 



Music and Sound 


"How did you know how to do that?" he asks.

"You just have to figure it out."

"I wouldn't know where to start," he says.

I think to myself, That's the problem, all right, where to start. To reach him you have to back up and back up,
and the further back you go, the further back you see you have to go, until what looked like a small problem
of communication turns into a major philosophic enquiry.

Robert M. Pirsig, Zen and the Art of Motorcycle Maintenance

The problem of finding the right place to begin an explanation is rather like finding the right fulcrum
point to move a stone with a lever. Putting the fulcrum point too close to the stone provides great
leverage but little range of movement (figure 1.1a). Putting it too far from the stone provides great range
of movement but no leverage (figure 1.1b). The fulcrum point of an explanation is the knowledge and
assumptions the reader must already have in order to make sense of the explanation. The assumptions
are like the axioms in geometry: a short list of simple, self-evident facts from which the entire subject
can ultimately be derived.

This chapter is such a fulcrum for the rest of this book, and it therefore runs the greatest risk of
overwhelm or underwhelm. Given the choice, I've decided to err on the side of underwhelm. The
rest of this chapter introduces some basic properties of sound that will become immediately useful
in chapter 2. If it looks like there are no surprises here, skip this chapter.

And if this subject is new to you, I have a suggestion: if any of the material seems beyond you at
times, just read it like a mystery novel. Seriously! I recommend this approach based on years of
personal experience reading things I didn't at first understand. You don’t have to speak fluent French
in order to enjoy Paris, but you'll certainly get more out of it if you pick some up along the way.

1.1 Basic Properties of Sound 

If you were to strike a tuning fork and hold it next to your ear, you would hear one of nature’s purest,
simplest sounds. What you hear is a result of the periodic changes in air pressure at your eardrum
caused by the vibration of the air set in motion by the tines of the fork (figure 1.2a). Figure 1.2b is
a representation of the air molecules in the vicinity of the fork, showing areas of greater and lesser
air pressure radiating away from the fork as it vibrates, similar in some respects to the way water
waves radiate away from a stone thrown into a pond.

Figure 1.1

Fulcrum. (Panel label: hard to lift, but the rock moves farther.)

Figure 1.2

Sound wave from a vibrating tuning fork.

1.1.1 Physical Properties 

The rate of periodic pressure change is frequency, and the strength of pressure fluctuations is
intensity. The onset is the time when the sound begins, and its duration is the length of time we can hear
it. The characteristic way in which the intensity of a sound changes through time is its envelope.

One final attribute, wave shape, completes the basic list of the physical properties of sound. Our
hearing uses the shape of sound waves to characterize sound quality. We use words like "pure,"
"shrill," and "muffled" to describe wave shapes. We also use wave shape to identify the type of
sound source, for instance, a trumpet or an oboe.

There are many other important properties of sound, such as the direction it comes from and
what it means to us. But frequency, intensity, onset, duration, envelope, and wave shape are enough
to start with.





Frequency is measured as cycles per second. The unit of one cycle per second is hertz (Hz) (see
section 4.3.1). Humans can hear sound over the range of about 17 Hz to about 17,000 Hz. Sound
intensity is measured in decibels (dB) (see section 4.24.1). From soft to loud, intensity of sound
ranges from the threshold of hearing at about 40 dB in very quiet rooms up to the limit of hearing
at about 120 dB, also called the threshold of pain. Duration is measured in seconds.
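
As a rough illustration of the decibel measure, the following Python sketch converts a sound intensity to a level in decibels using the standard definition, ten times the base-10 logarithm of an intensity ratio. The function name and the reference intensity of 10^-12 watts per square meter are assumptions made for this example only; section 4.24.1 gives the full treatment.

```python
import math

# Assumed reference intensity in W/m^2 (a conventional choice, not taken from this chapter).
REFERENCE_INTENSITY = 1e-12

def intensity_to_db(intensity, reference=REFERENCE_INTENSITY):
    """Return the level in decibels of `intensity` relative to `reference`."""
    return 10.0 * math.log10(intensity / reference)

# Each tenfold increase in intensity adds 10 dB to the level.
for watts_per_m2 in (1e-12, 1e-8, 1.0):
    print(f"{watts_per_m2:g} W/m^2 -> {intensity_to_db(watts_per_m2):.0f} dB")
```

The point of the sketch is only that decibels compress an enormous range of intensities into a small range of numbers.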

1.1.2 Perception of Sound 

Even though our senses are connected directly to the world, our inner experience of phenomena
is not identical to the stimuli we receive. Our perception depends upon a multitude of interacting
factors, including the sensitivity of our sense organs and the various ways our brains can be wired;
even the culture of our birth and our location in time and space affect our experience of the world.
So our language has developed terms that relate our inner experience to outer phenomena.

For simple sounds such as a tuning fork, the principal physical properties of sound are pretty
closely related to what we hear. When the high- and low-pressure waves from the tuning fork have
propagated through the air to the ears, they push and pull on the ear drum at the same rate that the
tuning fork created them (just as the reeds at the edge of a pond rock back and forth from the waves
created by a stone thrown into the water). The ears report the frequency of these air pressure
changes to the brain as pitch. The intensity of the pressure changes is reported to the brain as
loudness. If there are no changes in air pressure around the ears (that is, if the atmospheric pressure
remains unchanged), we hear silence. In a musical context, onset and duration of sounds are
perceived as elements of rhythm.

Loudness, pitch, onset, and duration seem to be relatively straightforward one-dimensional
measures of our experience. A sound gets louder or softer; higher or lower; faster or slower, much
the same way as a thermometer rises and falls with temperature. Measuring timbre, on the other
hand, is not so simple.

Later I explain that the physical and psychological aspects of sound cannot be
compartmentalized quite as neatly as I’ve suggested here, and that timbre is not as hard to study as it at first seems.

1.2 Waves 

A wave is an organized traveling disturbance in a medium, such as air. The medium itself does not
flow because of the wave; rather, a disturbance in the medium travels through the medium. Waves
transmit energy without transmitting matter. For instance, part of the energy from the vibrating
tuning fork is transferred to the ear.

1.2.1 Wave Shape

When I describe a wave as organized, I mean that it has a characteristic shape. Our ears are very
sensitive to the shape of pressure changes in sound waves as they strike our ears. Throughout our
lives we learn to associate particular wave shapes with particular sound sources. We also use this
information to identify a sound’s relative location and important characteristics about our
environment. The wave shape of a tuning fork is very simple in comparison to most other sounds. If we graph
the average particle density of the tuning fork sound shown in figure 1.2b, we see a shape similar to
figure 1.2c.

Figure 1.3

The wave shape of a vibrating tuning fork.

The vibration of the tines of a tuning fork is very small and too rapid for the eye to see. But
suppose we could view this motion, for example, by attaching a minuscule pen to one of its tines and
then quickly passing a roll of paper underneath while it vibrates. Under magnification the vibration
might be seen to leave a wavy mark on the paper (figure 1.3). The wave shape would be similar
to the one in figure 1.2c.

1.2.2 Simple Harmonic Motion 

The back and forth motion of the tuning fork tine shown in figures 1.2 and 1.3 is known as simple
harmonic motion. Understanding this motion is fundamental to understanding all kinds of
vibration, including music, the quantum mechanical motion of an atom, and the celestial music
of the spheres. This motion is easiest to visualize when it is made up of the interplay of inertia
of a mass and the elastic force of a spring. For the tuning fork, the mass and the spring are
just different aspects of the same metallic substance: the metal has both inertia and elastic force.
But we can better visualize simple harmonic motion by suspending a large mass from the end
of a spring (figure 1.4a). This allows us to neglect the mass of the spring and the elasticity of
the mass.

If left undisturbed, the mass will eventually come to rest at its point of equilibrium, where
the downward force of gravity equals the upward-lifting spring force. But if it is disturbed
from its equilibrium position, the mass will vibrate up and down in simple harmonic motion
(figure 1.4b).

1.2.3 Guided Tour of Simple Harmonic Motion

If I pull down on the mass and release it, the force of the stretched spring lifts the mass upward against
gravity and against the inertia of the weight, attempting to restore it to its equilibrium position. As
its velocity increases, momentum tends to keep the mass traveling upward. The spring begins to go
slack as the mass rises, and when the mass reaches the equilibrium point, the spring no longer lifts
the mass upward. But the mass continues to rise above the equilibrium point in spite of the slack
spring, though its velocity slows. When its momentum is exhausted, the mass stops at a point of
maximum positive displacement from equilibrium, and its velocity momentarily goes to zero.

Figure 1.4

Simple harmonic motion. (Labeled positions: maximum positive displacement, with zero velocity and
maximum negative acceleration; zero displacement, with maximum velocity and zero acceleration;
maximum negative displacement, with zero velocity and maximum positive acceleration.)

The slack spring cannot hold the mass above its equilibrium point, so with its upward
momentum spent, the force of gravity takes over and begins to pull the mass downward. Its velocity
increases until it reaches its equilibrium point again. The mass continues to fall below the
equilibrium point, though it slows because it is increasingly opposed by the tightening spring. The mass
stops at a point of maximum negative displacement from equilibrium, and its velocity momentarily
goes to zero. Then the cycle repeats.

Now go back to the initial moment, while I was still holding the mass below its equilibrium
point. At that moment, the mass had zero velocity and zero acceleration. The moment I released
it, it had zero velocity, but maximum acceleration. As the mass rose to approach its equilibrium
point, its acceleration diminished, but its velocity continued to grow. At the equilibrium point,
acceleration was zero, but velocity was maximum. Above equilibrium, the mass decelerated and
velocity diminished, until at maximum positive displacement, velocity was zero.

Then the same process took place in reverse. At the moment it began its downward movement,
acceleration was maximum, velocity was zero. As the mass approached its equilibrium point, its
acceleration diminished, but velocity continued to grow. At the equilibrium point, acceleration
was zero, but velocity was maximum. Below equilibrium, the mass decelerated and velocity
diminished, until at maximum negative displacement, velocity was again zero.

Figure 1.5 shows the motion of the spring/mass system through time. Points marked A, B, and C
in figure 1.4 are shown as lines in figure 1.5 for reference. The mass achieves its maximum velocity
in the instant it crosses its equilibrium point (B), and at this point it has zero acceleration. The mass
achieves its maximum acceleration in the instant it reaches its point of maximum displacement
from equilibrium (A and C), and at this point it has zero velocity. When the mass has maximum
velocity (and zero acceleration) we say it has peak kinetic energy, and when the mass has
maximum acceleration (and zero velocity) we say it has peak potential energy.

Figure 1.5

Displacement, velocity, and acceleration of simple harmonic motion.

Figure 1.6

Other sources of simple harmonic motion.
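
The relationships among displacement, velocity, and acceleration described in this guided tour can be checked numerically. The Python sketch below assumes an ideal, undamped mass released from rest at its lowest point, and models its displacement as a negative cosine of time; the function name, the unit amplitude, and the one-second period are illustrative assumptions, not values from the text.

```python
import math

def shm_state(t, amplitude=1.0, period=1.0):
    """Displacement, velocity, and acceleration of a mass released from
    rest at maximum negative displacement at time t = 0 (idealized model)."""
    omega = 2.0 * math.pi / period                   # angular frequency, radians per second
    x = -amplitude * math.cos(omega * t)             # displacement from equilibrium
    v = amplitude * omega * math.sin(omega * t)      # velocity: rate of change of x
    a = amplitude * omega**2 * math.cos(omega * t)   # acceleration: equals -omega^2 * x
    return x, v, a

# Sample the first half cycle: the release, the equilibrium crossing, and the top.
for label, t in (("release", 0.0), ("equilibrium", 0.25), ("top", 0.5)):
    x, v, a = shm_state(t)
    print(f"{label:12s} x={x:+.2f}  v={v:+.2f}  a={a:+.2f}")
```

In this model the acceleration is always proportional to the displacement and opposite in sign, which is why it vanishes at equilibrium and peaks at the extremes, just where the velocity does the reverse.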

If we changed the inertia of the mass or the elasticity of the spring, we'd change its characteristic
speed of vibration. If we used a heavier weight, the frequency would go down; if we used a stiffer
spring, the frequency would go up. But the characteristic shape of the motion would remain. If we
stretched the spring farther before letting it go, we'd increase the total potential and kinetic energy
of the vibration, giving it a larger amplitude. But again, the characteristic harmonic motion would
remain.
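
A small Python sketch can make the dependence of frequency on mass and stiffness concrete. It assumes the standard formula for an ideal, undamped mass-spring system, in which frequency is the square root of stiffness divided by mass, divided by 2π; this formula is not derived in this chapter, and the function name and numeric values are hypothetical.

```python
import math

def natural_frequency(stiffness, mass):
    """Frequency in Hz of an ideal, undamped mass-spring system.
    stiffness: spring constant in N/m; mass: in kg."""
    return math.sqrt(stiffness / mass) / (2.0 * math.pi)

base = natural_frequency(stiffness=100.0, mass=1.0)
heavier = natural_frequency(stiffness=100.0, mass=4.0)   # four times the mass
stiffer = natural_frequency(stiffness=400.0, mass=1.0)   # four times the stiffness

print(f"base:    {base:.3f} Hz")
print(f"heavier: {heavier:.3f} Hz")
print(f"stiffer: {stiffer:.3f} Hz")
```

Under this assumption, quadrupling the mass halves the frequency and quadrupling the stiffness doubles it. Amplitude does not appear in the formula at all, consistent with the observation that stretching the spring farther changes the energy but not the character of the motion.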

There are many examples of simple harmonic motion in the universe. The tuning fork and the
spring/mass example and the examples in figure 1.6 are all simple mechanical vibrating systems.
Even the basilar membrane, which is the organ within our hearing system that converts acoustic
energy into nerve impulses, vibrates using the same principle of simple harmonic motion. Simple
harmonic motion can also be studied in electrical, optical, chemical, thermal, atomic, and other
natural systems.

Figure 1.7

Sinusoid: simple harmonic motion through time.

1.2.4 Sine and Sinusoid 

Look again at figure 1.3. Tracing the shape made by a body moving in simple harmonic motion
through time, we observe it makes a characteristic curve. Such a curve is a sinusoid. Simple
harmonic motion is sinusoidal motion.

Figure 1.7b shows one period of the sinusoid generated by the spring and weight apparatus
shown in figure 1.7a. Notice that the spring and weight make the pen move fastest when the wave
crosses the centerline. This point is also where its acceleration reverses (going from acceleration
to deceleration). Thus, sinusoidal motion captures all the salient features of simple harmonic
motion through time.

The term sinusoidal means having the shape of a sine wave. Sine motion is a mathematical
abstraction of simple harmonic motion, just as a point is a geometrical abstraction of a location in
space. We can make an ink dot on a piece of paper and say it represents a geometrical point;
similarly, a particular sinusoidal motion can be said to represent sine motion. But both sine motion and
geometrical point really exist only in our minds, and the sinusoid and ink dot are their real-world
counterparts.

Here's the difference: as we will see in chapter 5, sine motion has a precise mathematical
definition in terms of circular motion. Because it is based on the circle, sine motion is a timeless
description of motion having no beginning or end. Thus, sine motion is a mathematical ideal,
an infinite, perfect motion that cannot exist outside of our imaginations. On the other hand, any
reasonable approximation of sine motion (such as the one shown in figure 1.7) can be called
sinusoidal. Because no physical motion can more than approximate ideal sine motion, all such
real-world approximations are by definition sinusoidal.
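
The distinction between ideal sine motion and a sinusoid can be illustrated in code. The Python sketch below computes a finite list of samples, each taken as the height of a point moving around a unit circle. Because the list has a beginning, an end, and limited numeric precision, it is a sinusoid in the sense used above, not sine motion itself. The function name and parameter values are hypothetical choices for this example.

```python
import math

def sample_sinusoid(frequency, sample_rate, count):
    """Return `count` samples of a unit-amplitude sinusoid.
    Each sample is the height of a point moving around a unit circle."""
    samples = []
    for n in range(count):
        angle = 2.0 * math.pi * frequency * n / sample_rate  # angle swept so far, in radians
        samples.append(math.sin(angle))                      # vertical projection of the point
    return samples

# Nine samples at eight samples per cycle span exactly one period.
one_period = sample_sinusoid(frequency=1.0, sample_rate=8.0, count=9)
print([round(s, 3) for s in one_period])
```

Chapter 5 develops the circular-motion definition properly; this sketch only shows that any concrete realization is necessarily finite and approximate.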







Figure 1.8 

Damped waveform of a plucked musical instrument. 

1.2.5 Conservative and Nonconservative Forces

Unless we continually supply energy to an object vibrating in simple harmonic motion, it will
eventually come to rest at its equilibrium position because its energy is constantly being dissipated,
radiated away as heat and/or sound. The effect of energy dissipation on a vibrating system is
damping. Figure 1.8 shows how a sinusoid generated by the system in figure 1.7 might look through time
because of the interplay of vibratory forces and dissipative forces.

If all the energy drains away at once, there can be no vibration, because then there’s no energy
left with which to vibrate. But even if the energy drains away slowly, all the energy will eventually
dissipate completely. This suggests that there are conservative and nonconservative forces at work
simultaneously in vibrating systems. The conservative forces operate within the system to
perpetuate vibration, while the nonconservative forces operate between the system and its surroundings
to dissipate energy through friction, and radiate energy through heat and sound. The balance
between these two kinds of forces determines how the system vibrates.

■ A spring’s elastic force is a conservative force that is constantly transforming the spring’s up and
down movement from potential to kinetic energy and back again as the system vibrates.

■ The external frictional force of air resistance and the internal friction of the spring itself are
nonconservative forces that dissipate the system's energy into its surroundings, until total energy in
the system has returned to its equilibrium.

Note in figure 1.8 that only the amplitude of the damped waveform changes through time, while
the frequency (here represented as the distance covered by each repeated waveform) remains the
same throughout.
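The behavior in figure 1.8 can be sketched numerically. The example below is only an illustration: it assumes that the amplitude decays exponentially, a common model of damping that the text does not itself specify, and the function name `damped_sinusoid` and its parameter values are hypothetical choices.

```python
import math

def damped_sinusoid(t, amplitude=1.0, frequency=440.0, decay=5.0):
    """Displacement at time t (seconds) of a sinusoid whose amplitude
    decays exponentially (an assumed model) while its frequency stays fixed."""
    envelope = amplitude * math.exp(-decay * t)
    return envelope * math.sin(2.0 * math.pi * frequency * t)

# Take two samples at the same phase of the cycle, 100 periods apart.
# The period between them never changes; only the height shrinks.
period = 1.0 / 440.0
first_sample = damped_sinusoid(0.25 * period)
later_sample = damped_sinusoid(100.25 * period)
print(first_sample, later_sample)
```

The second value printed is smaller than the first, even though both samples sit at the same point in the repeating waveform.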

In common usage, the terms "oscillate" and "vibrate" are often interchanged. But they are not
the same: a system vibrates when it moves or swings from side to side regularly; a system oscillates
if it moves or swings from side to side continuously and regularly. Hence, a sinusoid oscillates,
whereas a plucked string vibrates.





1.3 Summary 

The physical properties of sound include frequency, intensity, onset, duration, and wave shape.
Frequency, onset, and duration are time-based aspects of sound, and intensity is a measure of the
energy in a sound. These physical properties of frequency and intensity correspond to the
perceptual cues of pitch and loudness. Onset and duration largely determine musical rhythm.

A wave is an organized traveling disturbance in a medium that transmits vibrating energy
without transmitting matter. The simplest wave shape is the sinusoid, generated by simple harmonic
motion. This motion is created by the interplay of elastic forces and inertia. The velocity of an
object moving in simple harmonic motion is greatest near its equilibrium point; acceleration is
greatest near the extremities of its excursion. If we graph simple harmonic motion in time, it makes
a sinusoidal shape.

The forces that sustain vibration are conservative forces; the forces that cause damping are 
nonconservative forces. 




2 Representing Music


Both mathematical notation and musical notation point to universes quite different from the one in which
ordinary language functions so well. But, in each too, there is genius in the very notation that has developed
for giving representation to ideas that seem to lie beyond ordinary language. There are times in mathematics
when the similarities in notation is the first clue to a deeper relationship. Similarly musical notation not only
created a structure within which Western music could develop but also shows something other than just the
sounds being made. It indicates how the various elements stand in relation to one another, how sound creates
a space, it shows how different musical voices move against and through each other. The notation in both
subjects can make visible the hidden connections within each subject that reveal hidden connections among
outside phenomena.

— Edward Rothstein, Emblems of the Mind

Just as music comes alive in the performance of it, the same is true of mathematics. The symbols on the page
have no more to do with mathematics than the notes on a page of music. They simply represent the experience.

— Keith Devlin, Mathematics: The Science of Patterns

Our ears are continuously bombarded with a stream of pressure fluctuations from the surrounding
air, not unlike the way ocean waves ceaselessly beat upon the shore. Nonetheless, our ears discern
discrete events in this continuous flow of sound and assign them meaning, such as footsteps, a
baby's cry, or a musical tone.

Just as the geometrical point is a mental construct that helps us navigate the underlying
continuity of space, so the musical tone is a free creation of the human mind that we apply to the
unbroken ocean of sound to help us organize and make sense of what we hear. Though its definition has
been stretched to the breaking point by recent musical trends, tone is still the fundamental unit of
musical experience. This chapter lays out the basics of music representation from a mathematical
perspective, laying the groundwork for subsequent chapters.

2.1 Notation 


The realm of personal musical experience lies entirely within each one of us, and we cannot share
our inner experiences directly with anyone. However, many world cultures have developed systems
for communicating musical experience by representing it in symbolic written and verbal
forms. As members of a particular culture, we learn from childhood to map our inner experiences
of music onto particular symbols which carry meaning that all members share. This allows us to
speak and write about music, learn and perform the works of others, transcribe and analyze musical
performances, and teach music, among other things. All this is possible because of the innate
human capacity to abstract musical tones from the continuous stream of sound and to represent
these tones symbolically.

This chapter characterizes one such system: the Western common music notation system
(CMN). Its prevalence today makes it a good entry point to a broader discussion of the
mathematical basis of tuning systems (see chapter 3). Understanding CMN will help us to fully
appreciate its relationship to other musical traditions as well as to understand the history of tuning
systems and current musical research.

2.2 Tones, Notes, and Scores 

In CMN a tone is characterized by three sonic qualities: pitch, musical loudness, and timbre. When
a tone is combined with two additional temporal qualities, onset and duration, the result is a note.
A note is a tone placed in a particular temporal context.

Notes are combined in temporal order to create a musical score, which provides the necessary
context to correctly interpret the performance of the notes. Roughly speaking, when notes are
performed in sequence, the result is melody, and when notes are performed simultaneously, the result
is harmony. The context provided by a score includes the sequence order of the notes and their
timing as well as other details of how the notes are to be played on particular musical instruments.

Figure 2.1 shows a complete score written in CMN consisting of a single note. The score is
written out on a staff of five horizontal lines that serves as a grid indicating pitch range. The relative
pitch and duration of a note are indicated by placing note symbols such as ♩ (quarter note), ♪ (eighth
note), 𝅗𝅥 (half note), and 𝅝 (whole note) on the staff lines. The mapping of pitches to staff lines is
determined by the type and placement of the clef sign, placed at the left of each staff. The clef mark
in figure 2.1 is the G clef, 𝄞. The spiral in this symbol encircles the second-to-bottom line,
indicating that this staff line corresponds, by ancient convention, to the pitch G. This pitch is
one whole step below A440, the reference pitch used to tune all modern Western instruments.
Another common clef is the F clef, 𝄢. When placed on a staff, its two vertical dots bracket the
second-to-top line, indicating that this staff line corresponds, by the same ancient convention, to
the pitch F, a fifth below middle C.



Figure 2.1 

A score of a single note in Western common music notation. 






Figure 2.2 

Amplitude function of the score in figure 2.1. The waveform has been shortened to make it fit the page.

If we were to record a musical instrument performing this score, the waveform might look like
the one in figure 2.2, which shows fluctuating air pressure A as a function of time t. Figures 2.1
and 2.2 are just different views of the same information, the former describing the sound
symbolically, the latter describing it physically.

Each view has advantages and disadvantages. The functional view provides a great deal of
information about how a particular performer realized (performed) the note, allowing us to
analyze the physical vibration of the instrument. But it is generally not useful to give such a
representation to another player to describe how to play the same note. For this, the symbolic
approach is superior.

There are many useful representations of tones, each of which has advantages and disadvantages
in different contexts. For instance, although we can easily derive pitch, loudness, and duration
information from either a musical score or from a functional representation like figure 2.2, neither
gives much direct insight into timbre (see chapter 6).

2.3 Pitch 

Frequency is a physical measure of vibrations per second. Pitch is the corresponding perceptual 
experience of frequency. 

Pitch has been defined as "that auditory attribute of sound according to which sounds can be
ordered on a scale from low to high" (ANSI 1999). Unfortunately, stipulating precisely what "that
auditory attribute" is turns out to be a complex scientific affair that has spanned centuries
of research. While our sense of pitch is proportional to frequency, it is also influenced by frequency
range, loudness, and the presence of other higher or lower frequencies. Pitch is limited to sounds
within the range of human hearing, but frequency is not.

There are at least two motivations for developing measurements of pitch: scientific curiosity and
the requirements of music engineering. I take up the scientific interests in chapter 6. Meanwhile,
there is the more pragmatic problem of engineering the pitch range of human hearing for musical
purposes so that we may communicate musically about pitch.






2.3.1 Frequency and Pitch 

If we restrict ourselves to simple tones such as might come from a flute or tuning fork, then for some
tone with frequency f we hear some corresponding pitch p. For instance, if the frequency of a tuning
fork is f = 440 Hz, then the pitch p that we hear is conventionally called A440, the pitch commonly
used by modern Western orchestras to tune all instruments together. The reference pitch used by
orchestras has not always been set at 440 Hz but has varied through the ages. It became standardized
at 440 vibrations per second in the early part of the twentieth century (see section 3.2.3).

2.3.2 Intervals and Frequency

An interval is the difference in pitch between two tones. The sensitivity of our ears to intervals is
the basis of melody and harmony.

If a reference tone has frequency $f_R$, then a tone with frequency $f_R \cdot 2^1$ is said to be one octave
higher. If the frequency is $f_R \cdot 2^2$, then it is two octaves higher. Generalizing, the frequency $f_x$
of any octave $x$ of the reference frequency $f_R$ is

$$f_x = f_R \cdot 2^x, \quad x \in \mathbb{I}. \qquad \text{Octaves} \quad (2.1)$$

This equation says, "The frequency $x$ octaves above reference frequency $f_R$ is equal to the
reference frequency times 2 raised to the power of $x$." The expression $x \in \mathbb{I}$ means that $x$ is an element
of the set of all integers: all possible positive and negative whole numbers, and zero. Here it suffices to say
that $x \in \mathbb{I}$ means that $x$ can be any integer. The significance of requiring $x$ to be an integer is that
frequency $f_x$ will only be an octave of $f_R$ if $x$ is an integer value.

If $x = 0$, the frequency $f_x$ is in unison with $f_R$ because $f_x = f_R \cdot 2^0 = f_R$. If $x = -1$, the
frequency $f_x$ is an octave below $f_R$ because then $f_x = f_R \cdot 2^{-1} = f_R/2$. If we allow $x$ to be any
integer, all octaves of $f_R$ can be realized.
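Equation (2.1) translates directly into a few lines of code. The following is a minimal sketch; the function name `octave_frequency` is an illustrative choice, and the check that x is an integer simply mirrors the restriction stated above.

```python
def octave_frequency(f_ref, x):
    """Equation (2.1): the frequency x octaves from f_ref, for integer x."""
    if not isinstance(x, int):
        raise TypeError("x must be an integer number of octaves")
    return f_ref * 2.0 ** x

# Octaves of A440: one below, unison, one above, two above.
for x in (-1, 0, 1, 2):
    print(x, octave_frequency(440.0, x))
```

With a reference of 440 Hz this prints 220.0, 440.0, 880.0, and 1760.0 for x = -1, 0, 1, and 2.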

2.3.3 Character of Intervals

Our ears are extremely sensitive to the intervals of unison and octave, and virtually all cultures
organize their music primarily around these intervals. The unison has the musical quality of
identity. For example, if two flutes intone A440, we say their pitch is identical.

Octaves have a musical quality of equivalence. If identity means that two pitches sound the
same, equivalence means that we can tell them apart but each can serve the same musical purpose
equivalently. In virtually every musical culture, pitches in any octave can perform the same musical
function, a principle known as octave equivalence.

If the range of $x$ in equation (2.1) is expanded to include all real numbers, then we can obtain
the frequency $f_x$ of any arbitrary interval $x$ of reference frequency $f_R$:

$$f_x = f_R \cdot 2^x, \quad x \in \mathbb{R}. \qquad \text{Interval} \quad (2.2)$$

The expression $x \in \mathbb{R}$ means $x$ is an element of the set of real numbers (in other words, $x$ can
be any real number). Real numbers include all integers and all possible fractional values between
the integers as well. Real values in the range $0 < x < 1$ select frequencies within the first octave
above $f_R$. Values $x < 0$ select frequencies below $f_R$, values $x > 1$ select frequencies beyond the
first octave above $f_R$, and so forth.
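Equation (2.2) and its inverse can be sketched in the same way. The function names below are illustrative, and the inverse relation, x = log2(f / f_R), is not stated in the text but follows from solving equation (2.2) for x.

```python
import math

def interval_frequency(f_ref, x):
    """Equation (2.2): frequency at interval x, measured in octaves (any real x)."""
    return f_ref * 2.0 ** x

def interval_in_octaves(f, f_ref):
    """Equation (2.2) solved for x: how many octaves f lies from f_ref."""
    return math.log2(f / f_ref)

# A value between 0 and 1 lands inside the first octave above the reference.
half_octave = interval_frequency(440.0, 0.5)
print(half_octave)
print(interval_in_octaves(half_octave, 440.0))
print(interval_in_octaves(220.0, 440.0))   # negative: below the reference
```

The first value printed lies between 440 Hz and 880 Hz, and the last is -1.0, one octave below the reference.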

An exponent appears in equations (2.1) and (2.2) as the independent variable; it seems that our
neural anatomy is wired to perceive an exponential relation between pitch and frequency.
Frequency f goes up exponentially as pitch p goes up linearly: to double pitch, we must quadruple
frequency.¹

2.3.4 Interval Ratios

The frequencies of tones that make up an interval can be compared by making a ratio of their
frequencies. For instance,

■ The interval of a unison is 1/1.

■ The interval corresponding to one octave up is 2/1.

■ The interval corresponding to one octave down is 1/2.

Consider the interval formed by the frequencies 880 Hz and 440 Hz. This ratio can be reduced
to lowest terms:

$$\frac{880}{440} = \frac{2}{1}.$$

The same is true of 132/66, 34/17, and so on. The advantage of expressing intervals as ratios in
lowest terms is that the kind of interval can be seen directly without the complication of the
actual frequencies involved.
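Reducing a frequency ratio to lowest terms is a mechanical step. As a small sketch, Python's standard `fractions.Fraction` type performs the reduction; the helper name `interval_ratio` is an illustrative choice, and it assumes frequencies given as whole numbers of hertz.

```python
from fractions import Fraction

def interval_ratio(f_upper, f_lower):
    """Ratio of two frequencies (whole numbers of Hz), reduced to lowest terms."""
    return Fraction(f_upper, f_lower)

for pair in ((880, 440), (132, 66), (34, 17), (660, 440)):
    ratio = interval_ratio(*pair)
    print(pair, "->", f"{ratio.numerator}/{ratio.denominator}")
```

The first three pairs all reduce to 2/1, the octave. The last pair, added here for contrast, reduces to 3/2.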

2.3.5 Categorizing Intervals 

If the unison expresses identity and the octave expresses equivalence, the rest of the intervals
signify individuality. Each of the intervals has a unique character to its sound, like a unique
personality, that the ear can readily detect regardless of wide variations in frequency, amplitude,
duration, or timbre. Our hearing seems to organize intervals by a subjective sense of distance that
can be characterized as height or width: the interval of a fifth (3/2 = 1½) is experienced as "higher"
or "wider" than a fourth (4/3 = 1⅓). In chapter 6 this quality is called chroma. Intervals figure
prominently in music because they are so readily distinguished by our hearing.

2.3.6 Organizing Pitch Space 

Equation (2.1) shows that there are an infinite number of pitches because we can assign any values
to reference frequency $f_R$ or octave $x$. But to engineer a practical scale system requires that we take
into account the realistic limits imposed by our hearing.


Determining the Range of Pitch Space. First, we can only hear frequencies between about
17 Hz and about 17,000 Hz (higher generally for youths and women, lower for rock concert
aficionados, people who listen to music over headphones at elevated levels, people who drove
Volkswagens in the 1960s, and the aged, especially the aged who drove Volkswagens to rock
concerts while wearing headphones).

Even within this frequency range, pitches above about 4000 Hz are difficult to tell apart.
Recognizing this, the musical engineers of the world's musical traditions have historically set realistic
limits on the frequency range used by musical instruments to represent distinct musical pitches.
The piano has one of the widest pitch ranges of traditional instruments. Its lowest pitch is about
27 Hz, and its highest is a little more than 4000 Hz.
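These limits can be restated in octaves by taking the base-2 logarithm of a frequency ratio, which is equation (2.2) solved for x. The sketch below uses only the approximate figures quoted above; the function name `octaves_spanned` is illustrative.

```python
import math

def octaves_spanned(f_low, f_high):
    """Number of octaves between two frequencies: log2 of their ratio."""
    return math.log2(f_high / f_low)

hearing = octaves_spanned(17.0, 17000.0)   # approximate range of hearing
piano = octaves_spanned(27.0, 4000.0)      # approximate range of the piano
print(round(hearing, 2), round(piano, 2))
```

By these rough figures, hearing covers just under ten octaves and the piano a little more than seven.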

Determining the Density of Pitch Space. If pitches are crowded too closely together in
frequency, we have a hard time telling them apart. Because of this, the world's musical engineers have
limited the total number of pitches that cover the range of pitch space so that each can be easily
identified. In some traditions there are as few as a dozen pitches altogether. The Western orchestra provides
only about 90 total pitches to work with. So even though there are thousands of potentially identifiable
pitches in the range of human hearing, relatively few are actually selected for use in musical scales.

Assigning Pitches. To communicate about music, we must be able to name the pitches and
associate them with frequencies. This is not an engineering problem so much as a design question, and
each culture has answered it in a manner that speaks to what is important to that culture. In the West
the choices have been profoundly influenced by the ideas of Pythagoras (see chapter 3).

2.4 Scales 

A musical scale is an ordered set of pitches, together with a formula for specifying their
frequencies. Each individual pitch of a scale is called a degree. The degrees are an ordered set of names
and positions for the scale pitches.

Most musical traditions have acknowledged the importance of the unison and octave intervals
by organizing their scales around them like anchor points. Most scales associate names of the
degrees with their frequencies in one octave only, with the understanding that because of octave
equivalence, degrees of the scale can be played in any other octave yet still perform the same
musical function in the scale.

In an unfortunate twist of terminology, the degrees of the scale are also sometimes called pitch
classes. (I'd rather they'd been called something like degree classes.) In any event, each degree is a
member of a class that it shares with the same degree in all other octaves because of octave equivalence.

2.4.1 Gamut 

A term related to scale is gamut, the entire range of notes reachable by an instrument or voice.
Whereas a scale theoretically has no limits in frequency, a gamut does, as it is always tied to a
particular instrument that can play only so high or so low.

"Gamut" is actually a compound of two other terms: the Greek letter gamma, Γ, used as a symbol
for the lowest tone of the medieval musical scale, and "ut," the first syllable of a then-well-known
hymn to St. John, the melody of which has the peculiarity of beginning one degree higher with each
successive phrase. "Gamut" thus represents "all the tones from gamma onward" (Apel 1944).

2.4.2 Diatonic Scale 

The prototype of all scale systems in the West is the diatonic scale. It has seven pitches per octave,
named with the seven letters C, D, E, F, G, A, and B corresponding to the seven degrees of this
scale.² The degrees of the diatonic scale are named tonic (1), supertonic (2), mediant (3),
subdominant (4), dominant (5), submediant or superdominant (6), and subtonic (7). They are represented
in CMN as shown in figure 2.3. This scale may also be familiar as the scale that goes with the
solmization syllables do, re, mi, fa, sol, la, ti.³

The diatonic scale contains two interval sizes, the half step and the whole step. A whole step
contains exactly two half steps. The whole step and the half step are also called whole tone and
semitone. Chapter 3 details the frequencies that go with each diatonic scale degree and the frequency
size of the half and whole steps. Here I focus only on the order of the interval sizes. The interval
order of the diatonic scale is the sequence of whole and half steps in the scale.

The interval order and the starting degree are the two primary identifying characteristics of the 
diatonic scale that hold regardless of the pitch the scale starts on. 

Figure 2.4 shows the interval order of the diatonic scale. Note the characteristic order of interval
sizes: {2, 2, 1, 2, 2, 2, 1}, and observe that the scale starts on the first degree. For our purposes,
these two characteristics completely define the diatonic scale. Note the asymmetrical structure of
the interval order: there's a group {2, 2, 1} followed by {2, 2, 2, 1}. The unique order of whole
and half steps provides a crucial asymmetry that our hearing exploits in order to orient ourselves

Degree:     1    2    3    4    5    6    7
Letter:     C    D    E    F    G    A    B    (C)
Syllable:   do   re   mi   fa   sol  la   ti   (do)

(Staff notation with whole-step and half-step brackets not reproduced.)

Figure 2.3 

Diatonic scale. 



Starting degree: 1

Degree:          1   2   3   4   5   6   7   (1)
Interval size:     2   2   1   2   2   2   1

Figure 2.4

Interval order of the diatonic scale.






Figure 2.5 

Piano keyboard. 

to the music we're hearing. If the interval pattern were not asymmetrical, it would be impossible
for us to orient ourselves in the scale.
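The interval order can be turned into concrete scale positions by accumulating the interval sizes. The sketch below is illustrative only: the names `DIATONIC_ORDER`, `scale_offsets`, and `distinct_rotations` are hypothetical, and counting rotations is just one simple way to make the asymmetry argument tangible, not a model of hearing.

```python
from itertools import accumulate

DIATONIC_ORDER = [2, 2, 1, 2, 2, 2, 1]   # interval sizes in semitones

def scale_offsets(interval_order):
    """Semitones of each degree above the starting degree, octave included."""
    return [0] + list(accumulate(interval_order))

def distinct_rotations(interval_order):
    """Count how many different orders arise from starting on each degree."""
    n = len(interval_order)
    return len({tuple(interval_order[i:] + interval_order[:i]) for i in range(n)})

print(scale_offsets(DIATONIC_ORDER))        # [0, 2, 4, 5, 7, 9, 11, 12]
print(distinct_rotations(DIATONIC_ORDER))   # 7: every starting degree differs
print(distinct_rotations([2] * 6))          # 1: a symmetrical order gives no cue
```

Because all seven rotations of the diatonic order differ, the pattern of intervals heard around any degree identifies that degree. A perfectly symmetrical order offers no such landmark.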

2.4.3 Staff Lines and the Piano Keyboard

Look at figure 2.3 again and notice that the staff lines hide the asymmetry of the diatonic interval
order visually. Each successive degree of the scale moves vertically up the staff by the same
distance regardless of whether the interval between the successive degrees is a semitone or
a whole tone. However, the asymmetry can't be hidden in the layout of the piano keyboard
(figure 2.5). When starting from C, the interval pattern of the keyboard is the same as the diatonic
interval order.

2.5 Interval Sonorities

Groups of intervals share sonorities, common traits that allow us to group them together (table 2.1).
The sonorities correspond to the sonic character of the intervals. Perfect intervals have a quality
that has been described as clear, pristine, structural, or astringent. Major intervals and minor
intervals supply a warmth or feelingful character. Augmented intervals and diminished intervals
provide a piquancy or strangeness that can be disturbing. Table 2.1 shows the classification of the
intervals. Intervals can also be classified as consonant or dissonant (see section 3.10).

2.5.1 Major and Minor Scales

Another name for the diatonic scale is the major scale. The minor scale uses the standard diatonic
interval order but starts on degree 6. Table 2.2 shows three octaves of the diatonic scale from left
to right. The diatonic interval order is highlighted in the middle row, and the minor interval order
is shown below it.

If we project one octave of the diatonic scale clockwise on a circle, as in figure 2.6, we see that
the minor scale is the same as the major scale started two diatonic degrees counterclockwise around
the circle. So the major and minor scales are related by the underlying diatonic order and are
distinguished only by their starting degrees.
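Starting the same circular order from a different degree amounts to rotating a list. In this minimal sketch the helper name `start_on_degree` is hypothetical, and the letters are those of the untransposed scale.

```python
DIATONIC_ORDER = [2, 2, 1, 2, 2, 2, 1]
LETTERS = ["C", "D", "E", "F", "G", "A", "B"]

def start_on_degree(seq, degree):
    """Rotate a sequence so that it begins on the given 1-based degree."""
    i = degree - 1
    return seq[i:] + seq[:i]

minor_order = start_on_degree(DIATONIC_ORDER, 6)
minor_letters = start_on_degree(LETTERS, 6)
print(minor_order)     # [2, 1, 2, 2, 1, 2, 2]
print(minor_letters)   # ['A', 'B', 'C', 'D', 'E', 'F', 'G']
```

The rotated order matches the minor interval order of table 2.2, and the rotated letters spell the minor scale that begins on A.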






Table 2.1

Interval Classification by Sonority

Class       Name     Semitones  Description
Perfect     Unison   0          Provides harmonic anchoring and framework.
            Octave   12
            Fourth   5
            Fifth    7
Major       Third    4          Provides expansive emotional color.
            Sixth    9
            Seventh  11
            Second   2
Minor       Third    3          Upper pitch is one semitone smaller than major intervals.
            Sixth    8          Minor intervals provide a contractive emotional color.
            Seventh  10
            Second   1
Diminished           6          Upper pitch is one half step less than a minor or a perfect
                                interval. A diminished fifth is called a tritone.
Augmented            6          Upper pitch is one half step greater than a major or a
                                perfect interval. An augmented fourth is also called a tritone.


Table 2.2

Diatonic and Minor Scale Interval Order

Diatonic degree          ...  1  2  3  4  5  6  7  1  2  3  4  5  6  7  1  2  3  4  5  6  7
Diatonic interval order  ...  2  2  1  2  2  2  1 [2  2  1  2  2  2  1] 2  2  1  2  2  2  1
Minor interval order     ...  2  2  1  2  2 [2  1  2  2  1  2  2] 2  1  2  2  1  2  2  2  1

Brackets mark the highlighted interval orders.



Figure 2.6 

Major and minor scales.






Figure 2.7 

Starting the diatonic scale on other degrees to create modes. 

2.5.2 Modes 

Starting a scale from other than degree 1 or 6 produces scales that are other than major or minor
but that share the diatonic interval order. Called modes, these variations of the diatonic scale order
are shown in figure 2.7. The initial degree of a mode is its final because typically music in a mode
would end on that note. So the final of Ionian mode is 1 and the final of Aeolian is 6.

The names derive, evidently, from seventeenth-century French music theorists, who named the
modes arbitrarily after regions of Greece (Apel 1944). (The music theory of the ancient Greeks
bears no resemblance to these modes.) The diatonic modes are the tonal basis of Gregorian chant
and of early Western music (until about 1600 c.e.).

Notice that the major and minor scales are synonyms for Ionian and Aeolian modes,
respectively. The various modes can be played on the white keys of a piano simply by starting the mode
on the degree indicated in the figure. For example, starting on degree 4 produces the Lydian mode.
The Locrian mode is purely a theoretical mode, considered unusable by conventional music
theory because of the tritone that exists between its final (7) and degree 4 of the diatonic scale.
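The modes can be generated by rotating the diatonic interval order, and the peculiarity of the Locrian mode then shows up as a number. The sketch below makes two assumptions beyond the text: the mode for degree 2 is given its conventional name, Dorian, and the interval between each final and the note four steps above it is measured in semitones (7 is a perfect fifth, 6 a tritone). The helper names are hypothetical.

```python
from itertools import accumulate

DIATONIC_ORDER = [2, 2, 1, 2, 2, 2, 1]
MODE_NAMES = ["Ionian", "Dorian", "Phrygian", "Lydian",
              "Mixolydian", "Aeolian", "Locrian"]

def mode_order(final):
    """Interval order of the mode whose final is the given diatonic degree."""
    i = final - 1
    return DIATONIC_ORDER[i:] + DIATONIC_ORDER[:i]

def semitones_above_final(order, steps):
    """Semitones from the final to the note `steps` scale steps above it."""
    return ([0] + list(accumulate(order)))[steps]

for final, name in enumerate(MODE_NAMES, start=1):
    order = mode_order(final)
    print(final, name, order, semitones_above_final(order, 4))
```

Every mode except Locrian shows 7 in the last column. Locrian shows 6, which is the tritone between diatonic degrees 7 and 4.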

The listener may notice that some of the modes, especially Phrygian and Mixolydian, have
a kind of antique quality to their sound. Before the advent of tempered tunings (see chapter 3),
composers exploited the modes as an important source of tonal contrast. Shifting between modes
was a way to add structure and shape to a composition. However, with the arrival of transposable
instruments in the Baroque period, interest in modes declined, as key transposition took over the
role of the modes to structure music. This left only the major and minor scales in common use.
Hence, music built upon modal scales can sometimes suggest an ancient quality to the Western ear.

2.5.3 Chromatic Scale

The chromatic scale extends the diatonic scale by breaking up the whole steps into half steps and
adding these new half steps to the scale. It uses the standard diatonic letter names A-G but adds
symbols that raise or lower each diatonic degree by a semitone to indicate these in-between half





Degree:       1   2   3   4   5   6   7   8   9   10  11  12
With sharps:  C   C♯  D   D♯  E   F   F♯  G   G♯  A   A♯  B
With flats:   C   D♭  D   E♭  E   F   G♭  G   A♭  A   B♭  B


Figure 2.8 

Chromatic scale in common Western notation. 



Figure 2.9 

Diatonic scale names with chromatic and enharmonic inflections.


steps. The symbol ♯ (sharp) raises a diatonic degree by a semitone, and the symbol ♭ (flat) lowers
it a semitone. The symbol ♮ (natural) restores a previously sharped or flatted pitch to its diatonic
degree. Sharp, flat, and natural are accidentals. Given the order of half and whole steps in the
diatonic scale from which it is constructed, there are thus 12 semitones in the chromatic scale:

{A, (A♯ | B♭), B, C, (C♯ | D♭), D, (D♯ | E♭), E, F, (F♯ | G♭), G, (G♯ | A♭)},

where the symbol | means or. Thus, one may write either A♯ or B♭, since they are enharmonic
equivalents: they sound the same pitch. On the piano, for example, A♯ and B♭ are the same
physical key (see figure 2.5).

The musical representation of all 12 pitches of the chromatic scale in CMN is given in figure 2.8.
This scale can equivalently be written using flats instead of sharps (or any mixture). The fact that
the degrees of the chromatic scale are named by their position with respect to the degrees of the
diatonic scale shows again that the chromatic scale was derived from it.

In addition to the standard chromatic enharmonic spelling using sharps and flats, degrees can
also be represented using double sharps (𝄪) and double flats (𝄫), which raise or lower their
respective degrees by two semitones (figure 2.9). The degree names in each column are enharmonic
equivalents, thus C𝄪 = D = E𝄫.
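Enharmonic equivalence is easy to check by reducing each spelling to a chromatic degree. The sketch below is an illustration under stated assumptions: it numbers the chromatic degrees from C = 0 rather than from 1 as in figure 2.8, and it uses the ASCII stand-ins `#` for sharp, `b` for flat, and `x` for double sharp. All names are hypothetical.

```python
NATURALS = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}
ACCIDENTALS = {"#": 1, "b": -1, "x": 2}   # "bb" is read as two flats

def chromatic_degree(name):
    """Chromatic degree 0-11 (C = 0) of a spelling such as 'A#', 'Bb', or 'Ebb'."""
    letter, marks = name[0], name[1:]
    shift = sum(ACCIDENTALS[mark] for mark in marks)
    return (NATURALS[letter] + shift) % 12

print(chromatic_degree("A#") == chromatic_degree("Bb"))                              # True
print(chromatic_degree("Cx") == chromatic_degree("D") == chromatic_degree("Ebb"))    # True
```

Two spellings are enharmonic equivalents exactly when they reduce to the same number.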








Figure 2.10 

Diatonic scale in keys of G and F.


             ← Flat keys                                    Sharp keys →
Key:         C♭  G♭  D♭  A♭  E♭  B♭  F   C   G   D   A   E   B   F♯  C♯
Accidentals: 7♭  6♭  5♭  4♭  3♭  2♭  1♭  0   1♯  2♯  3♯  4♯  5♯  6♯  7♯
             ← Number of flats                        Number of sharps →

Figure 2.11

Transposition versus accidentals required for the diatonic scale.

2.5.4 Transposing 

If a scale is started on any chromatic degree but C, it is said to be transposed. The diatonic scale
can be transposed to any chromatic degree so long as the diatonic interval order of whole and half
steps is preserved. For instance, if we begin the diatonic scale on G, then F must be sharped to
preserve diatonic interval order; similarly, if we start it on F, then B must be flatted. Figure 2.10a
shows the diatonic scale transposed to G, and figure 2.10b shows it transposed to F.⁴ The degree
to which the diatonic scale is transposed is called the key. For example, the diatonic scale
transposed to G by the introduction of F♯ is the key of G. The untransposed diatonic scale is the key
of C.
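We can watch the accidentals appear by building the diatonic scale on a new starting degree and comparing it with the key of C. This sketch numbers chromatic degrees from C = 0, so that F = 5, F♯ = 6, G = 7, B♭ = 10, and B = 11; the function name `diatonic_degrees` is hypothetical.

```python
from itertools import accumulate

DIATONIC_ORDER = [2, 2, 1, 2, 2, 2, 1]

def diatonic_degrees(start):
    """Set of chromatic degrees (C = 0) in the diatonic scale begun on `start`."""
    steps = [0] + list(accumulate(DIATONIC_ORDER))[:-1]
    return {(start + step) % 12 for step in steps}

key_of_c = diatonic_degrees(0)
for name, start in (("G", 7), ("F", 5)):
    scale = diatonic_degrees(start)
    print(name, "adds", sorted(scale - key_of_c), "drops", sorted(key_of_c - scale))
```

The key of G adds degree 6 and drops degree 5, so F becomes F♯. The key of F adds degree 10 and drops degree 11, so B becomes B♭.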


2.5.5 Key Signature

Notice that F is a fifth below C, while G is a fifth above C. Transposing the diatonic scale to begin
on F requires one flat: B♭. Transposing to G requires one sharp: F♯. As we go down by fifths from
C, the scale built on each subsequent transposed degree requires the introduction of one more flat
in order to preserve the interval order of the diatonic scale. Correspondingly, as we go up by fifths
from C, the scale built on each subsequent pitch requires the introduction of another sharp. This
result is shown pictorially in figure 2.11.

A major or minor scale can be erected on any of the chromatic degrees by appropriate
application of accidentals to establish the correct major or minor interval order. The accidentals
required to start a major or minor scale on each chromatic degree are shown in figure 2.12. These
are called key signatures because they stipulate the association between the key (the chromatic
degree that the scale starts on) and the accidentals required for the corresponding diatonic
scale.







Sharps:  0    1    2    3    4    5    6    7
Major:   C    G    D    A    E    B    F♯   C♯
Minor:   A    E    B    F♯   C♯   G♯   D♯   A♯

Flats:   0    1    2    3    4    5    6    7
Major:   C    F    B♭   E♭   A♭   D♭   G♭   C♭
Minor:   A    D    G    C    F    B♭   E♭   A♭


Figure 2.12 

Key signatures. 

Figure 2.12 allows us to infer from a score what the key should be. For example, if we observe three
sharps in a score, we can infer that its corresponding major scale must start on A and its corresponding
minor scale must start on F♯.⁵ Since the major and minor keys that share a key signature are related
by the underlying diatonic interval order, they are called the relative major and relative minor. For
example, the relative major of F♯ minor is A major, while the relative minor of A major is F♯ minor.
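The lookup in figure 2.12 can also be computed. The sketch below rests on an arithmetic observation that is not in the text: a fifth is 7 semitones, and because 7 × 7 = 49 leaves a remainder of 1 when divided by 12, multiplying a chromatic degree (C = 0) by 7 modulo 12 counts the ascending fifths needed to reach it from C. Multiplying by 5 counts descending fifths in the same way. The function names are hypothetical.

```python
def fifths_up_from_c(tonic):
    """Ascending fifths from C to the major tonic: the number of sharps
    when the key is spelled with sharps."""
    return (7 * tonic) % 12

def fifths_down_from_c(tonic):
    """Descending fifths from C to the major tonic: the number of flats
    when the key is spelled with flats."""
    return (5 * tonic) % 12

def relative_minor(major_tonic):
    """The relative minor starts on degree 6, nine semitones above the major tonic."""
    return (major_tonic + 9) % 12

A, F, F_SHARP = 9, 5, 6
print(fifths_up_from_c(A))          # 3 sharps for A major
print(relative_minor(A) == F_SHARP) # True: F-sharp minor shares the signature
print(fifths_down_from_c(F))        # 1 flat for F major
```

Only counts up to 7 correspond to key signatures in figure 2.12; larger results simply indicate that the other spelling (flats instead of sharps, or the reverse) is the practical one.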

2.5.6 Circle of Fifths

As we move farther away from the key of C in figure 2.11, enharmonically equivalent keys start
to crop up. In particular, the key of D♭ is enharmonically identical to the key of C♯, the key of G♭
is the same as the key of F♯, and the key of C♭ is the same as the key of B. This suggests that there
is a circularity involved in the key structure, which becomes apparent if we twist the key sequence
shown in figure 2.11 into a spiral, as shown in figure 2.13.

This is the circle of fifths, although it is easier to represent as a spiral, since it could continue into
the double sharps and double flats, and so on. There are only 15 useful mappings of the diatonic
interval order onto the chromatic scale, namely the ones shown in figure 2.11.
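The circularity can be verified with modular arithmetic. In this minimal sketch a fifth is taken as 7 semitones, chromatic degrees are numbered from C = 0, and the function name `degree_after_fifths` is hypothetical.

```python
def degree_after_fifths(k):
    """Chromatic degree (C = 0) reached after k fifths from C.
    Positive k ascends (sharp keys); negative k descends (flat keys)."""
    return (7 * k) % 12

circle = [degree_after_fifths(k) for k in range(12)]
print(circle)                    # [0, 7, 2, 9, 4, 11, 6, 1, 8, 3, 10, 5]
print(degree_after_fifths(12))   # 0: twelve fifths return to C

# The enharmonic key pairs named above: (sharps, flats) = C#/Db, F#/Gb, B/Cb.
for sharps, flats in ((7, 5), (6, 6), (5, 7)):
    print(degree_after_fifths(sharps) == degree_after_fifths(-flats))   # True
```

Twelve successive fifths visit every chromatic degree once before returning to the start, which is why the sharp and flat ends of figure 2.11 meet.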

2.5.7 Nondiatonic Scale Orders

Of course, the diatonic scale specifies but one of many possible orderings of intervals. While
diatonic ordering has had immense influence on music of cultures around the world, we're free to
choose any ordering that serves our needs. The following is a select sampling of some nondiatonic
scales. More are considered in chapter 3.

Pentatonic Scale. If the diatonic scale is the father of scales, the pentatonic scale must be the
grandfather, for it appears in virtually every culture worldwide. Its interval order is {2, 3, 2, 2, 3}.
The black keys on a piano are an instance of the pentatonic scale. As with the diatonic scale, one can
create pentatonic modes by choosing a different starting degree (figure 2.14).

Harmonic Minor Scale. This scale (figure 2.15) uses the interval order of the minor scale but
raises the seventh degree by one semitone. Its interval order is {2, 1, 2, 2, 1, 3, 1}. The minor scale





Figure 2.13

Spiral (circle) of fifths.


C  D  F  G  A  c

Figure 2.14

Pentatonic scale.


C  D  E♭  F  G  A♭  B  c

Figure 2.15

Harmonic minor scale.


(see section 2.5.1) is sometimes called the natural minor scale to differentiate it from the harmonic
minor. The seventh degree of the diatonic scale is sometimes called the leading tone because it
seems to lead the ear to the tonic. Raising the seventh degree of the natural minor lends this
important harmonic function to the minor scale.

Melodic Minor Scale This scale (figure 2.16) varies its order depending upon the melodic
function of the music, hence its name. It has an ascending order, which is used when the music rises
up the scale, and a descending order, which is used when the music goes down the scale. The
ascending order of this scale is like the harmonic minor but with the sixth degree also raised by
a semitone. The descending form is identical to the natural minor.

Figure 2.16
Melodic minor scale. Ascending: C D E♭ F G A B c. Descending: c B♭ A♭ G F E♭ D C.

Figure 2.17
Hungarian minor: C D E♭ F♯ G A♭ B c, with an augmented 2d between E♭ and F♯ and a tritone from C to F♯.

Figure 2.18
Whole-tone scales. First kind: C D E F♯ G♯ A♯ c. Second kind: D♭ E♭ F G A B d♭.

Hungarian Minor This minor scale (figure 2.17) has an augmented second between the third
and fourth degrees, and an augmented fourth (tritone) from first to fourth, lending it a spicy, rakish
quality.

Whole-Tone Scale As there are 12 chromatic degrees per octave, picking every other semitone
yields a scale containing only six degrees (excluding the octave), all of them whole tones. Its
interval order is symmetrical: {2,2,2,2,2,2}. Since we pick every other degree, there are necessarily
two kinds of whole-tone scale (figure 2.18). The chromatic degrees of the first kind are 2n,
n = 0, 1, ..., 5, and the degrees of the second kind are 2n + 1 over the same range (counting the
first degree of the chromatic scale as 0).
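
The two kinds of whole-tone scale can be enumerated directly from the 2n and 2n + 1 rules. The following is a minimal sketch; the sharps-only `NAMES` list and the choice of C as chromatic degree 0 are assumptions made for illustration.

```python
# Chromatic degree names, counting C as degree 0 (sharps only, for brevity).
NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

first_kind = [2 * n for n in range(6)]        # 0, 2, 4, 6, 8, 10
second_kind = [2 * n + 1 for n in range(6)]   # 1, 3, 5, 7, 9, 11

print([NAMES[d] for d in first_kind])    # ['C', 'D', 'E', 'F#', 'G#', 'A#']
print([NAMES[d] for d in second_kind])   # ['C#', 'D#', 'F', 'G', 'A', 'B']

# Together the two kinds use every chromatic degree exactly once.
assert sorted(first_kind + second_kind) == list(range(12))
```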

Because the whole-tone scale interval order is symmetrical, it does not provide the ear with the
anchoring asymmetry supplied by, for example, the diatonic interval order, leaving listeners
harmonically "at sea." An obvious compositional device is to alternate between the two whole-tone
scales for contrast. A falling whole-tone scale gives a particularly vulnerable and "slippery"
feeling to the fall. Composers as various as Claude Debussy⁶ and Thelonious Monk⁷ have featured this
scale in their compositions.





2.6 Onset and Duration


The duration of a note is the number of beats it lasts. The beat is the fundamental unit of time
measurement and corresponds to the pulse of the music; in other words, what you tap your foot to.
Beats are grouped into measures, set off from other measures in a score by bar lines.

The onset of a note is the moment stipulated by the score for it to begin, counted in beats from the
beginning of the score. The onset time of a note is the same moment counted in seconds from
the beginning of the score. Onset time can be calculated by multiplying the number of beats
from the beginning by the duration of a beat.
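
As a one-line illustration of the onset-time rule, here is a small Python sketch. The function name `onset_time` and the sample values are hypothetical; the beat duration is simply passed in as seconds per beat.

```python
def onset_time(onset_beats, beat_duration):
    """Onset time in seconds: beats from the start of the score times seconds per beat."""
    return onset_beats * beat_duration


# A note that begins 6 beats into the score, with each beat lasting 0.5 s.
print(onset_time(6, 0.5))   # 3.0
```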

2.6.1 Relative Duration 

Musical symbols for relative note duration are given in the upper row of table 2.3. The symbols
in the lower row indicate the duration of rests, the silences between notes. In table 2.3, each symbol
indicates a duration one half as long as the symbol to its left. Shorter durations, such as one
thirty-second, can be created by adding more flags to the stem of the note.

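Additional relative durations can be derived from those in table 2.3 as needed by the addition
of dots to the right of notes or rests. A single dot extends the duration of the note or rest by 1/2
of its value. For example, a dotted quarter note lasts as long as a quarter note plus an eighth note,
and a dotted quarter rest as long as a quarter rest plus an eighth rest. A second dot increases the
duration of the note or rest by an additional 1/4. For example, a double-dotted quarter note lasts as
long as a quarter note plus an eighth note plus a sixteenth note, and likewise for rests. In general,
n dots after a note or rest of duration D indicate that the total duration T is

$T = D \left(1 + \frac{1}{2} + \frac{1}{4} + \cdots + \frac{1}{2^n}\right) = D \left(2 - \frac{1}{2^n}\right)$.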
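
The dot rule amounts to the sum D + D/2 + D/4 + ... + D/2^n, which equals D(2 − 1/2^n). A minimal sketch follows, using exact fractions so that note values stay readable; the function name `dotted_duration` and the convention of measuring durations as fractions of a whole note are assumptions made for this example.

```python
from fractions import Fraction


def dotted_duration(d, n):
    """Total duration of a note or rest of duration d followed by n dots."""
    # Each dot adds half of the previously added value: d/2, then d/4, and so on.
    return sum(d / 2**k for k in range(n + 1))


quarter = Fraction(1, 4)             # a quarter note, as a fraction of a whole note
print(dotted_duration(quarter, 1))   # 3/8  (quarter + eighth)
print(dotted_duration(quarter, 2))   # 7/16 (quarter + eighth + sixteenth)
```

The closed form gives the same results, for instance 1/4 × (2 − 1/4) = 7/16 for the double-dotted quarter.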


2.6.2 Absolute Duration


The absolute duration of any note is determined by a metronome mark on the score in conjunction
with the duration symbols in table 2.3. The metronome mark indicates which duration symbol gets
the beat and how many beats there are per minute. For example, the metronome mark ♩ = 60 MM
indicates that the quarter note gets the beat and that there are 60 beats per minute. Thus, each
quarter note lasts for one second.
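
The relation between a metronome mark and absolute duration can be sketched as follows. The function names and the convention of expressing note values as fractions of a whole note (quarter = 0.25) are assumptions made for illustration.

```python
def beat_seconds(tempo_bpm):
    """Duration of one beat in seconds, given the number of beats per minute."""
    return 60.0 / tempo_bpm


def note_seconds(relative, beat_unit, tempo_bpm):
    """Absolute duration of a note in seconds.

    relative  -- note value as a fraction of a whole note (quarter = 0.25)
    beat_unit -- the note value that gets the beat, in the same units
    """
    return (relative / beat_unit) * beat_seconds(tempo_bpm)


print(note_seconds(0.25, 0.25, 60))     # 1.0   quarter note at quarter = 60 MM
print(note_seconds(0.5, 0.25, 60))      # 2.0   half note at the same tempo
print(note_seconds(0.125, 0.25, 120))   # 0.25  eighth note at quarter = 120 MM
```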

The tempo is the number of beats per minute. Rubato, small perturbations in the tempo, may
be employed by performers informally to emphasize a phrase or delineate a symmetry in the
music.


Table 2.3
CMN Symbols for Relative Duration

        Whole   Half   Quarter   Eighth   Sixteenth
Note    𝅝       𝅗𝅥      ♩         ♪        𝅘𝅥𝅯
Rest    𝄻       𝄼      𝄽         𝄾        𝄿








Table 2.4
Time Signatures

Signature   Meaning
2/4         Two quarters per measure
3/4         Three quarters per measure
4/4         Four quarters per measure
C           Common time (a)
6/8         Six eighths per measure
9/32        Nine 32ds per measure

Note: a. Same as 4/4 time.


The suffix MM on the metronome mark has an interesting history. It stands for "Mälzel
Metronome." Johann Nepomuk Mälzel was not the inventor of the metronome, which honor is in
fact due to Diedrich Nikolaus Winkel (1773-1826) of Germany. But Mälzel was a shrewd
businessman who patented Winkel's invention in England and France before Winkel could do so. So
successful was his marketing effort that only Mälzel's name remains commonly associated with
the metronome (Tiggelen 1987).

2.6.3 Time Signatures 

The rhythm of a score is determined by the time signature in much the same way that the scale is
determined by the key signature. The time signature stipulates how many beats there are per
measure and what beats are stressed to establish the rhythm (table 2.4). Common time groups four
quarter notes per measure. It is notated with a capital letter C.

Not all beats have an equal stress when performed. Often the first beat is stressed, while other
beats in a measure receive less stress. A few conventional stress patterns are associated with the
most common time signatures. For example, common time and 4/4 time stress beat 1 the strongest
and beat 3 somewhat less; the other beats are unstressed. For 3/4 time, typically beat 1 is the
strongest, beat 3 is stressed less, and beat 2 is unstressed. Like the asymmetrical structure of the
diatonic scale, the asymmetry in stress patterns helps orient the listener in the measure.

2.7 Musical Loudness

The sound intensity of many musical instruments can be adjusted over a certain range, depending
upon their construction. The range from the softest to loudest sound for an instrument is its
dynamic range. Some instruments, such as the harpsichord, are fixed at one loudness level. The
oboe has a small dynamic range, and the pipe organ has quite a wide dynamic range. Loudness
depends upon a number of perceptual and acoustical factors, and is not easy to characterize in
general terms (see section 6.5).

Nonetheless, CMN provides a very simple notation for dynamic levels. Part of every musician's
training is to learn how to translate the CMN symbols for dynamic level to the appropriate loudness
level for his or her instrument, depending upon musical context. The nuances of this context are
quite subtle and extensive, usually requiring years to master.

Table 2.5
CMN Indications for Dynamic Range

Pianississimo   ppp   As soft as possible     Mezzo forte     mf    Moderately loud
Pianissimo      pp    Very soft               Forte           f     Loud
Piano           p     Soft                    Fortissimo      ff    Very loud
Mezzo piano     mp    Moderately soft         Fortississimo   fff   As loud as possible

The CMN indications for dynamic range are shown in table 2.5. The Italian names are universally
used, I suppose because the Italians invented the usages, which were subsequently adopted by other
European countries. The dynamic range indications in table 2.5 are entirely subjective. I describe
how to relate them to objective measurements in section 4.24.

For instruments that can change dynamic level over the course of time, the "hairpin" symbol <
indicates a gradual increase in loudness, while > indicates a gradual decrease. Bowed
and blown instruments can usually effect a change in dynamic level during the course of a single
note. Struck instruments, including pianos, generally can't change the dynamic level of a note after
it is sounded but can change dynamic levels over the course of several notes. The proper
interpretation of these cues is part of every musician's training.

2.8 Timbre 

In musical scores, timbre means the type of instrument to be played, such as violin, trumpet, or
bassoon. But timbre also is used in a general sense to describe an instrument's sound quality as sharp,
dull, shrill, and so forth.

How quickly an instrument speaks after the performer starts a note, whether it can be played with
vibrato, and many other instrumental qualities are also lumped together as timbre. Timbre also gets
mixed up with loudness because some instruments, like the trombone, get more shrill as they get
louder. As a consequence, it's easier to say what timbre isn't than what it is: timbre is everything
about a tone that is not its pitch, not its duration, and not its loudness. However, negative definitions
are slippery and provide no new information.

There are other ways of representing tones that shed positive light on timbre. Just as colors can
be shown to consist of mixtures of light at various frequencies and strengths, sounds can be shown
to consist of mixtures of sinusoids at various frequencies and strengths (see volume 2, chapter 3).
For instance, when we hear a note played on a trumpet, even though our ears tell us we are hearing
a single tone, in fact we are hearing simpler tones mixed together in a characteristic way that our
minds (perhaps through long experience, perhaps through some intrinsic capability) fuse into
the perception of a trumpet sound.




Figure 2.19
Harmonic overtone series: partials f + 2f + 3f + 4f + 5f + 6f + 7f + ..., where f is the fundamental
and the higher partials are the overtones.

The individual sinusoids that collectively make up an instrumental tone are called its partials
because each carries a partial characterization of the whole sound. Partials are also known as
components, and I will use these terms interchangeably. The principal properties of the partials are
their frequencies and amplitudes. The way the partials manifest in frequency, amplitude, and time
is what our ears use to determine what kind of instrument made a particular sound.

2.8.1 Partials, Fundamentals, and Overtones

The lowest pitched partial in a tone is called the fundamental. It is generally what our ears pick out
as the pitch of the tone. Since, by definition, the remaining partials in the tone are pitched higher,
they are called overtones.⁸ Our ears use the pattern of overtone frequencies as an important cue to
recognize timbres. The overtone frequencies of wind and string instruments are positive integer
multiples of the fundamental, where the positive integers are 1, 2, 3, and so on. For instance, if a
flute or violin has fundamental frequency f, then the frequencies of its overtones will be positive
integer multiples of f (figure 2.19). The partials of such instruments are called harmonics. Note that
because the positive integers start at 1, and because 1 × f = f, the first harmonic is the
same as the fundamental.
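
A short sketch makes the numbering explicit: harmonic n has frequency n × f, harmonic 1 is the fundamental, and harmonics 2 and up are the overtones. The function name `harmonics` and the 220 Hz fundamental are illustrative choices only.

```python
def harmonics(f, count):
    """Frequencies of the first `count` harmonics of fundamental f (in Hz)."""
    return [n * f for n in range(1, count + 1)]


partials = harmonics(220.0, 5)
print(partials)              # [220.0, 440.0, 660.0, 880.0, 1100.0]

fundamental = partials[0]    # the first harmonic is the fundamental
overtones = partials[1:]     # every higher partial is an overtone
```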

Instruments with harmonic partials are usually chosen to carry the melody and harmony of
music because the frequencies of the harmonics tend to agree with the pitches of the diatonic
scale. Instruments with inharmonic partials such as drums and bells are usually not used to
carry melody and harmony because for the most part the frequencies of their partials do not agree
with the diatonic scale.

The amplitudes and frequencies of the partials of musical instruments tend to vary in a
characteristic way over the duration of a tone, depending upon the instrument and performance style of
the performer. Though the variation may be slight, the precise amplitude and frequency ballistics
of the partials help our ears to fuse a single percept of an instrument out of its individual partials,
and help identify the type of instrument.

2.8.2 Vibration Modes 

Each partial is created by a specific part of the vibrating system of the instrument. Consider a
vibrating string, for example. Its fundamental frequency is created by that portion of the total
energy in the string that vibrates coherently along its entire length (mode 1 in figure 2.20).
Vibration along the entire length of a string is called mode 1 vibration.

Figure 2.20
Vibration modes.

Not all the energy in a string vibrates in mode 1; some energy pushes one part of the string
down while the other end counters it by rising (mode 2 in figure 2.20). The frequency of this
vibration is twice the frequency of the fundamental, corresponding to the second harmonic.
Some of the string's energy causes it to vibrate in three balanced regions (mode 3 in figure 2.20)
corresponding to the third harmonic. For many vibrating systems (but not all), the higher the
mode, the less energy it has. Stringed instruments can have dozens of vibration modes with
significant energy.

Not all vibrating systems contain all possible modes. The clarinet has energy only at the
fundamental and odd-numbered harmonics. Some vibrating systems do not divide the vibrating
medium into integer ratios as the string does. The inharmonic partials of instruments such as bells
and drums are not integer multiples of a fundamental.

2.8.3 Spectra 

When we project sunlight through a prism, the resulting rainbow of colors, its spectrum, reveals
the individual colors of sunlight. The prism distributes the colors into a linear sequence from low
to high frequencies. The intensity of each color in the rainbow indicates the contribution of that
color to the quality of sunlight.

So, too, the spectrum of a sound shows the intensities and frequencies of the sinusoids that make
up the sound. A spectrum shows the energy distribution of a waveform in frequency.

The spectrum comprises the set of all possible frequencies from −∞ to ∞ Hz at all possible
intensities from 0 to ∞ dB (measuring up from silence). The spectrum of a particular sound will
be a subset of this infinite two-dimensional space.

Figure 2.21
Harmonic waveforms and spectra. Four panels, each pairing a waveform (amplitude vs. time) with its
spectrum (amplitude vs. frequency, marked f through 7f): fundamental; fundamental and 3d harmonic;
fundamental, 3d, and 5th harmonics; fundamental, 3d, 5th, and 7th harmonics.

For example, figure 2.21 shows four waveforms and their corresponding spectra. The top
waveform is a single sinusoid. Its spectrum shows a single vertical line. The line's horizontal position
gives the sinusoid's frequency, and its height gives the sinusoid's intensity. The spectrum of
the second waveform shows it contains two sinusoids, the fundamental at frequency f and the third
harmonic at frequency 3f. The third harmonic has less energy than the fundamental, as its shorter
line shows. The last two waveforms show additional odd-numbered harmonics being added, each
with higher frequency and less energy than their predecessors. If we could hear the last waveform,
it would sound somewhat like a clarinet. Since all frequencies are integer multiples of the
fundamental, these are harmonic spectra. Because the components in figure 2.22 are noninteger
multiples of the fundamental, this spectrum is an inharmonic spectrum. Percussion instruments
such as bells, gongs, and drums produce inharmonic spectra.
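
To make the waveform-to-spectrum relation concrete, the following sketch builds one period of a waveform from odd-numbered harmonics and recovers their amplitudes with a naive discrete Fourier transform. Everything specific here is an assumption for illustration: the amplitudes 1, 1/3, 1/5, 1/7, the 64-sample period, and the helper name `dft_amplitude`. It is not presented as the method used to draw the figures.

```python
import math

N = 64                                      # samples in one period of the fundamental
amps = {1: 1.0, 3: 1 / 3, 5: 1 / 5, 7: 1 / 7}   # harmonic number -> amplitude (assumed)

# Waveform: sum of sinusoids at f, 3f, 5f, and 7f, sampled over one period.
wave = [sum(a * math.sin(2 * math.pi * h * n / N) for h, a in amps.items())
        for n in range(N)]


def dft_amplitude(x, k):
    """Amplitude of the sinusoid at k cycles per analysis window (naive DFT)."""
    re = sum(v * math.cos(2 * math.pi * k * n / len(x)) for n, v in enumerate(x))
    im = sum(v * math.sin(2 * math.pi * k * n / len(x)) for n, v in enumerate(x))
    return 2 * math.hypot(re, im) / len(x)


# Static spectrum at harmonics 1 through 8 of the fundamental.
spectrum = [dft_amplitude(wave, k) for k in range(1, 9)]
for k, a in enumerate(spectrum, start=1):
    print(f"{k}f: {a:.3f}")
```

The printed amplitudes should be close to the values placed in `amps` at the odd harmonics and close to zero at the even ones, which is the pattern of vertical lines a harmonic spectrum plot displays.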

Static and Dynamic Spectra In the foregoing discussion, I have conveniently neglected time
as a required element. In order to compute the spectrum of a sound, we must have some length
of it to analyze. If we wish to capture all the spectral information available in a waveform, the
mathematics of spectral analysis requires us to observe the sound not just over its full duration
but actually over all of time, from minus infinity to positive infinity. This is clearly a physical
impossibility. Fortunately, there are mathematical techniques that allow us to analyze sounds
with limited length. However, the shorter the waveform, the less precisely we can characterize
its spectrum. So there is some inherent uncertainty between the temporal and spectral views
of waveforms of finite length. This subject is related to Heisenberg's uncertainty principle (see
volume 2, chapter 3).

Figure 2.22
An inharmonic spectrum (amplitude vs. frequency), with components at f, 2.5f, 4.6f, 7.2f, 9.1f, and 13.4f.

The length of sound available for spectral analysis determines the kind of spectrum we can
create. A static spectrum shows the energy distribution of partials averaged over a fairly long period
of time, such as the duration of an entire note. Figures 2.21 and 2.22 represent static spectra because
they show the average intensities of the partials over the duration of an entire note. Because static
spectra show averages, they cannot show how the energy distribution of a sound changes
dynamically over the duration of the note. Static spectra can be useful, for instance, to confirm whether
a sound is harmonic or inharmonic.

Dynamic Spectra Our ears are highly attuned to the way the spectra of sounds change through
time, and we rely on this information to help us identify the type of instrument making a sound.
The vibrational energy radiated by musical instruments evolves through time in a characteristic
way based on the physical properties of the instrument and how the musician performs on it. The
dynamic elements in an instrument's spectrum that are contributed by the performance include
vibrato, tremolo, glissando, crescendo, and decrescendo. There are also dynamic properties of the
instrument's vibration that are largely determined by the interaction of the physics of the
instrument and the physics of the performer's touch. Clearly, it would be very useful if we could capture
the way spectra evolve through time.

Suppose we have a musical note lasting a few seconds. We can observe how its energy
distribution evolves through time as follows:

1. Break the note down into a sequence of short sound segments each lasting a small fraction of
a second.

2. Take the static spectrum of each sound segment separately.

3. Assemble the spectra in time order.
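
The three steps above can be sketched with a naive discrete Fourier transform applied to consecutive segments. This is an illustrative sketch only: the function names, the 32-sample segment length, and the fading test tone are all assumptions, and practical analyses usually add overlapping, windowed segments, which this sketch omits.

```python
import math


def static_spectrum(segment, bins):
    """Amplitude at each of the given DFT bins for one sound segment."""
    n_samples = len(segment)
    result = []
    for k in bins:
        re = sum(v * math.cos(2 * math.pi * k * n / n_samples)
                 for n, v in enumerate(segment))
        im = sum(v * math.sin(2 * math.pi * k * n / n_samples)
                 for n, v in enumerate(segment))
        result.append(2 * math.hypot(re, im) / n_samples)
    return result


def dynamic_spectrum(samples, segment_length, bins):
    """Steps 1-3: split into segments, analyze each, keep the results in time order."""
    segments = [samples[i:i + segment_length]
                for i in range(0, len(samples) - segment_length + 1, segment_length)]
    return [static_spectrum(seg, bins) for seg in segments]


# Hypothetical test tone: a sinusoid with 4 cycles per segment that fades out linearly.
SEG = 32
tone = [(1 - n / (4 * SEG)) * math.sin(2 * math.pi * 4 * n / SEG)
        for n in range(4 * SEG)]

for t, spectrum in enumerate(dynamic_spectrum(tone, SEG, bins=[4, 8])):
    print(t, [round(a, 2) for a in spectrum])
```

Each printed row is one static spectrum; reading down the rows shows the amplitude at bin 4 shrinking from segment to segment as the tone fades, which is the kind of evolution a dynamic spectrum is meant to reveal.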

Imagine printing each static spectrum on a pane of glass, then assembling the panes in time
order. Looking through the panes, we can observe how the spectrum of the sound changes through
time. This three-dimensional result is a dynamic spectrum because it shows spectral evolution
through time. Figure 2.23 shows an idealized dynamic spectrum as a set of static spectra in time
order. The x-axis shows time, the y-axis intensity, and the z-axis frequency. Dashed lines connect
partials at the same frequency in adjacent spectral slices, showing how each partial's amplitude
changes through time.

Figure 2.23
Dynamic spectrum.

Figure 2.24 shows the spectral evolution of a string tone. We can tell a great deal about a sound
by looking at its spectrum through time. For instance, the even spacing of the partials along the
frequency axis suggests a harmonic spectrum. There are relatively few partials with significant
energy. Most energy is concentrated in the lowest partials, and energy drops quickly with
increasing partial number. The lower harmonics start sounding rather more quickly than the higher
harmonics, as indicated by the broad grey line across the components at the beginning, and higher
harmonics drop out more quickly, as indicated by the broad grey line across the components at
the end.

Much of the aliveness we hear in a musical tone is communicated to us by the way the
instrument's timbre changes instant by instant. The scrape of the bow on a violin string before the note
sounds, or the puff of air that precedes an alto saxophone tone, or the characteristic way the
overtones of a trumpet tone change strength during the course of a note provide important clues about
what we are hearing.

The sonogram is another way to graph dynamic spectra (figure 2.25). Time is shown on the
x-axis and frequency on the y-axis, and the thickness of the line shows the intensity of the spectral
components. The result is a two-dimensional image of the sound that has three-dimensional
information. This sonogram represents four distinct bird chirps.

Figure 2.24
Amplitude, frequency, and time plot of a stringed instrument tone. (Adapted from a drawing in Grey 1975.)

2.8.4 Amplitude Envelope

A tone's partials can be represented using just amplitude, frequency, and time.

■ If we look at these three attributes together, we see the tone's spectral envelope in three
dimensions (figure 2.24). But we can reduce this information to two dimensions by averaging.

■ Averaging the amplitude of each partial separately through time, we get the tone's static
spectrum in two dimensions: amplitude vs. frequency (figures 2.21 and 2.22).

■ Averaging the amplitude of all partials together through time, we get the tone's amplitude envelope
in two dimensions: amplitude vs. time (figure 2.26). Figure 2.26 follows the amplitude contour of
the waveform in figure 2.2.

Clearly, these are just three different views of the same information.



Figure 2.25
Sonogram of four bird calls (frequency axis marked from 0 to 6 kHz, time axis marked from 200 to 800 ms).

Figure 2.26
Amplitude envelope of waveform shown in figure 2.2, divided into attack, decay, sustain, and release.

The amplitude envelope of a note reveals in general how an instrument dissipates the energy it
receives from the player through time as sound. Amplitude envelopes are conventionally divided
into four segments:

■ Attack, the period of time from silence, when exciting energy is first applied to the instrument,
until the instrument is maximally dissipating its energy. Typical attack times are about 10 ms to
50 ms for most instruments. Energy may flood unevenly through the instrument at first, resulting
in vibrational instabilities that produce, for instance, a scratching sound in violins or a warbling
in brass tones. The ear is highly attuned to these instabilities and uses information about how the
sound starts, grows, and stabilizes to identify the source of the tone.





■ Decay, which follows the attack. Some instruments (brasses in particular) decline back to a
sustainable level based on the amount of exciting energy being applied continuously.

■ Sustain, the period following the decay. The instrument stabilizes so that the amount of energy
being dissipated matches the exciting force.

■ Release, the final portion of the sound from the moment no more energy is injected into the
instrument until all energy is dissipated and it becomes silent.

Together, this classification is denoted ADSR, named for the initial letters of each segment of
the amplitude envelope. These categories are completely arbitrary and by no means fit the
amplitude envelopes of most real instruments playing real music. For example, struck instruments such
as the piano have no sustain segments because they receive no sustaining force after the hammer
strikes the string. Legato performance effects, where an instrument plays overlapping notes, are
not well modeled by this system either. Nonetheless, it is sometimes a convenient shorthand and
quite commonly found in sound synthesizers.
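
As a rough illustration of how a synthesizer might realize the four segments, here is a piecewise-linear envelope function. It is a sketch under stated assumptions: the function name `adsr`, the straight-line segments, the normalized peak of 1.0, and the sample settings are all illustrative, and real synthesizers often use curved segments instead.

```python
def adsr(t, attack, decay, sustain_level, sustain_time, release):
    """Piecewise-linear ADSR amplitude (0.0 to 1.0) at time t, in seconds."""
    if t < 0:
        return 0.0
    if t < attack:                       # attack: silence up to the peak
        return t / attack
    t -= attack
    if t < decay:                        # decay: peak down to the sustain level
        return 1.0 - (1.0 - sustain_level) * t / decay
    t -= decay
    if t < sustain_time:                 # sustain: steady level
        return sustain_level
    t -= sustain_time
    if t < release:                      # release: sustain level down to silence
        return sustain_level * (1.0 - t / release)
    return 0.0


# Illustrative settings: 20 ms attack, 100 ms decay, level 0.6 held for 0.5 s, 200 ms release.
for ms in (0, 10, 20, 120, 400, 720, 820):
    print(ms, round(adsr(ms / 1000, 0.02, 0.1, 0.6, 0.5, 0.2), 3))
```

Setting `sustain_time` to zero gives a crude model of a struck instrument, for which the text notes there is no sustain segment.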

2.8.5 Bands and Bandwidth

A band is a range of frequencies within a spectrum. The bandwidth of a sound is the distance
between upper and lower frequency limits of a sound. The band center of a band is its mean
frequency. The bandwidth of human hearing is approximately 17 Hz to 17 kHz.

Sounds vary enormously in bandwidth. The bandwidth of a jet engine or a waterfall exceeds the
audible spectrum. These are called broadband sounds. The tuning fork has a very narrow
bandwidth and is called narrowband. Most musical instrument tones lie somewhere between.

2.8.6 Resonance 

How is it that a musical instrument or a voice can strengthen one partial and attenuate another?
The answer is that musical instruments are not as efficient at producing some frequencies as
others. Where an instrument has a resonance, it is efficient at producing that frequency, but
where it has an antiresonance, it may be inefficient or unable to vibrate at all. When we make
different vowel sounds with our mouths, we are amplifying certain partials of the broadband
waveform generated in the larynx and attenuating others. By some innate capacity or long
experience (or both), our minds associate a certain profile of strong and weak partials with a
particular vowel.

A formant is a group of frequencies of some particular bandwidth that is emphasized by a
resonant system. Vowels are vocal formants. Formants may be fixed or variable. For example, good
violins often have a fixed formant, sometimes called the singing formant, with a band center of
approximately 1000 Hz. Diphthongs in speech are actually formant ranges that shift up and down
in frequency, emphasizing higher or lower partials of the sound made by the glottis.

Resonance is involved in the production of sound for virtually every musical instrument. A flute
is driven by the breathy broadband noise coming from the player's mouth through its fipple. The
air trapped in the body of the flute tends to resonate only at particular frequencies and captures the
energy from the broadband noise only at these frequencies.

Figure 2.27
Harmonic spectrum and octaves (harmonics f through 16f).

To take a nonmusical example, consider a car driving down a corrugated dirt road. There is a
certain speed of travel that makes the car shudder the most violently from the corrugations: this
is the car's resonant frequency, that is, the frequency at which the most up-and-down energy from
the wheels passing over the corrugations can be transmitted to the rest of the car and its occupants.

2.8.7 Overtones and Octaves

As shown in figure 2.19, the harmonic series is a linear factor n times the fundamental frequency,
producing a series of harmonics such as f, 2f, 3f, 4f, 5f, 6f, 7f, .... The octave series is an
exponential factor $2^n$ times the fundamental: f, 2f, 4f, 8f, 16f, 32f, .... Figure 2.27 shows the
relation between partials and octaves. Notice that there are many more harmonics within the compass
of the higher octaves.
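
The crowding of harmonics into the higher octaves can be counted directly. In this sketch, octave k above the fundamental is taken (as an assumption for the example) to span frequencies from 2^k f up to, but not including, 2^(k+1) f; the function name is hypothetical.

```python
def harmonics_in_octave(k):
    """Harmonic numbers n with 2**k <= n < 2**(k + 1), so that n*f lies in octave k."""
    return list(range(2**k, 2**(k + 1)))


for k in range(4):
    ns = harmonics_in_octave(k)
    print(f"octave from {2**k}f to {2**(k + 1)}f: harmonics {ns}")
# Octave k contains 2**k harmonics: 1, 2, 4, 8, ...
```

Each successive octave therefore holds twice as many harmonics as the one below it.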

2.9 Summary 

Amazingly, we are able to parse discrete notes out of the ocean of sound surrounding us. And in spite
of the fact that we can't directly share our private experiences, we've developed symbolic systems
to communicate about many things, including music. Common music notation represents notes as
pitch, loudness, timbre, onset, and duration. A score is a collection of notes in time order. Notes are
written on a staff, which also provides clef, key signature, time signature, and metronome mark.

Pitch is how our ears register frequency. Loudness is how our ears register intensity. Timbre
describes either the kind of instrument making a sound or the sound's quality.

Intervals are characterized by the frequency ratio of two pitches. Intervals include the unison,
octave, perfect, and imperfect intervals, and the dissonances. Scales are made up of collections of
intervals in particular patterns. The diatonic scale is the prototype of modern Western music and
also the foundation for many other musical systems in the world. The modes are simply the
diatonic scale started on a different degree of the scale. The chromatic scale has 12 semitones per
octave. Scales can be played on any starting pitch by using sharps or flats to preserve the interval
order. There are many other nondiatonic scales besides the chromatic scale, including pentatonic,
harmonic minor, melodic minor, Hungarian minor, and the whole-tone scale.

Rhythms are written in terms of how many beats they occupy. Tempo is the beat rate. 

Timbre is the spectrum of frequencies in a tone. Harmonic spectra have an integer multiple
spacing between components. Partials are generated by the vibration modes of the instrument. Static
spectra average the strengths of the partials through time; dynamic spectra show each partial at each
moment in time. An amplitude envelope shows the average intensity of all partials through time.
The voice and most instruments have resonances that amplify or attenuate certain vibration modes.



3 Musical Scales, Tuning, and Intonation


Alterations of pitch in melodies take place by intervals and not by continuous transitions. We consequently
find the most complete agreement among all nations that use music at all, from the earliest to the latest times,
as to the separation of certain determinate degrees of tone from the possible mass of continuous gradations
of sound, all of which are audible, and these degrees form the scale in which the melody moves. But in
selecting the particular degrees of pitch, deviations of national taste become immediately apparent. The
number of scales used by different nations and at different times is by no means small.

- Hermann Helmholtz, On the Sensations of Tone¹

Why are musical scales organized the way they are? Why is most Western music based on scales 
made up of seven tones when there are twelve tones per octave? What does "equal-tempered"
mean, and why after all these centuries is it still controversial? What choices have other cultures 
made about intonation, and why? What can we learn about ourselves, our music, and our culture 
by taking a careful look at the underlying mathematics? This chapter examines one of the most 
basic issues of music technology: musical scales, tuning, and intonation. 

Certainly, tones and intervals are the primary materials of music. Virtually all music depends
upon playing tones in certain intervals to convey musical ideas. A flexible and convenient way of
describing tones and intervals is therefore fundamental, and this constitutes the main focus of this
chapter. However, what starts out like a walk in the park becomes a surprisingly twisty trail with
some deep insights into the choices our culture has made about the music we want to hear.

3.1 Equal-Tempered Intervals 

The modern equal-tempered scale is a good place to begin because it is so ubiquitous and so simple.
We can use it to develop some basic tools and terminology that will lead the way into a wider
discussion of intonation.

As described in chapter 2, modern Western instruments divide the octave into 12 equal-sized
semitones. This system of tuning is called equal temperament because the frequencies of all
intervals are based on one uniform semitone interval.





We can use equation (2.2), $f_x = f_R \cdot 2^x$, $x \in \mathbb{R}$, to compute the frequencies of the equal-tempered
scale. For some reference frequency $f_R$, we obtain the frequency $f_k$ of any equal-tempered interval $k$
($k = 0, 1, \ldots, 11$) within the first octave by computing

$f_k = f_R \cdot 2^{k/12}$. Equal-Tempered Intervals (3.1)

For example, the pitch one semitone above $f_R = 440$ Hz is $f_1 = f_R \cdot 2^{1/12} = 466.16$ Hz. The size
of the tempered semitone itself can be expressed as the ratio

$\frac{f_1}{f_R} = 2^{1/12} = \sqrt[12]{2} = 1.05946$. Semitone Interval (3.2)

The nomenclature $\sqrt[x]{z}$ means the xth root of z, so $\sqrt[12]{2}$ is the twelfth root of 2.
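
Equations (3.1) and (3.2) translate directly into code. The following is a minimal sketch; the function name `equal_tempered` and the constant name `SEMITONE` are hypothetical.

```python
def equal_tempered(f_ref, k):
    """Frequency of equal-tempered interval k (in semitones) above f_ref, eq. (3.1)."""
    return f_ref * 2 ** (k / 12)


SEMITONE = 2 ** (1 / 12)                     # eq. (3.2), about 1.05946

print(round(equal_tempered(440.0, 1), 2))    # 466.16
print(round(SEMITONE, 5))                    # 1.05946
print(round(equal_tempered(440.0, 12), 3))   # 880.0
```

Twelve semitone steps multiply the frequency by exactly 2, which is why interval 12 lands on the octave.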


3.2 Equal-Tempered Scale 

Table 3.1 shows the conventional assignment of alphabetic letters to the frequencies of the
equal-tempered scale. The table was generated by setting $f_R = 440$ Hz in equation (3.1) and
calculating the frequencies of all 12 values of k.

A slight modification of equation (3.1) enables us to create equal-tempered intervals outside of
an octave. In this version,

$f_{k,v} = f_R \cdot 2^{v + k/12}$, (3.3)

$f_{k,v}$ is the frequency of equal-tempered interval k in octave v. The values of k are the integers
between 0 and 11, and the value of v is any integer. Note that the octaves that v selects are relative
to the reference pitch, $f_R$. That is, v = 0 selects the same octave as $f_R$, while v > 0 selects octaves
above $f_R$ and v < 0 selects octaves below $f_R$.

This is unfortunately at odds with the common Western practice of naming octaves after the
order of their appearance on a standard 88-key piano keyboard. In this practice, A440 is in
the fourth piano octave and hence can also be called A4. C4 is called middle C in this system. The
88-key keyboard ranges from A0 to C8. All we have to do to adopt this practice is to subtract 4
from the octave term in the exponent of equation (3.3):

$f_{k,v} = f_R \cdot 2^{(v-4) + k/12}.$    (3.4)

For example, given $f_R = 440$ Hz, the frequency of the pitch A4 is $440 \cdot 2^{(4-4)+0/12}$, and the pitch
an octave and a semitone above is B♭5, and its frequency is $440 \cdot 2^{(5-4)+1/12}$.

Table 3.1
Frequencies of the Equal-Tempered Scale

k     Name      Frequency (Hz)     k     Name      Frequency (Hz)
0     A         440.000            7     E         659.255
1     A♯, B♭    466.163            8     F         698.456
2     B         493.883            9     F♯, G♭    739.988
3     C         523.251            10    G         783.990
4     C♯, D♭    554.365            11    G♯, A♭    830.609
5     D         587.329            12    A         880.000
6     D♯, E♭    622.253
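Equation (3.4) translates directly into a short function. This is only a sketch: the name `pitch_frequency` and the keyword `ref_octave` are assumptions introduced here, and `k` counts semitones above the reference pitch class, as in table 3.1.

```python
def pitch_frequency(k, v, f_ref=440.0, ref_octave=4):
    """Frequency of interval k in piano octave v, following equation (3.4).

    k is the number of semitones above the reference pitch class and
    v is the piano octave; v == ref_octave selects the octave of f_ref.
    """
    return f_ref * 2 ** ((v - ref_octave) + k / 12)

print(pitch_frequency(0, 4))  # A4, the reference itself
print(pitch_frequency(1, 5))  # an octave and a semitone above A4
```

Setting `ref_octave=0` recovers the relative octave numbering of equation (3.3).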

3.2.1 Constructing an Equal-Tempered Scale

To construct an equal-tempered scale, we must

1. Tie it to a reference frequency like A440

2. Name the intervals of the scale

3. Calculate the frequencies of the intervals from the reference

Choosing the Reference Frequency

Piano keys are named by combining their pitch class and
their octave. The octaves start at 0 at the bottom of the keyboard, and the lowest pitch is called A0.
Counting octaves up from A0, middle C corresponds to C4. By convention, we use A440 as the
reference and assign it to the piano key A4.

The Reference Octave

Now we must establish a reference octave. Here there is a small difficulty.
If the first pitch class in an octave were named A, the first letter in the alphabet, we could
use the A440 reference as both the pitch A4 and the pitch of the start of each octave. But
historically, new octaves begin with the pitch class C. Why the pitch class A was not chosen for this honor
is a mystery shrouded in an enigma, but we're stuck with it.

The solution is to use equation (3.3) to compute the frequency of C4 based on the pitch of A4.
Then we can use C4 as the reference frequency to deduce all the rest of the frequencies of the
equal-tempered scale.

We can figure out the frequency of middle C this way: if A4 is 440 Hz, then by equation (2.2),
A3 will be 220 Hz. Middle C is three semitones above A3 on the piano. So by (3.3), the frequency
of middle C is

$\mathrm{C4} = 440 \cdot 2^{-1 + 3/12},$    Middle C (3.5)

which is about 261.626 Hz. To make the following equations a little simpler, let's define R = C4 =
261.6 Hz. The purpose of introducing R is to let it stand for the reference frequency no matter what
actual frequency it is. For the following examples, we set the reference R to C4, but it could just as
easily be any other frequency, and we'll choose different values for R when we study other scales.

Defining Scale Intervals

Using reference frequency R, we can construct all other equal-tempered
pitches in any octave. To make this slightly more convenient, let's define the function

$f(k, v) = f_{k,v},$    (3.6)

where $f_{k,v}$ is as defined in (3.4). This function takes two arguments:

• k is an integer signifying one of the 12 pitch classes from C to B, numbered 0 to 11.

• v is the desired octave; octave number 4 corresponds to the fourth piano octave.

We can define a set of symbols for all equal-tempered pitches in all octaves using equation (3.6)
to specify their proper frequencies. For example, we can define the chromatic pitches playable on
a piano as follows, giving the pitch class k first and the octave v second:

A0 = f(9, 0), As0 = f(10, 0), B0 = f(11, 0), C1 = f(0, 1), Cs1 = f(1, 1), ...

C4 = f(0, 4), Cs4 = f(1, 4), D4 = f(2, 4), E4 = f(4, 4), F4 = f(5, 4), ..., B7 = f(11, 7), C8 = f(0, 8).
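One way to sketch the function of equation (3.6) and the resulting table of piano pitches is shown below. The argument order follows the definition, pitch class k first and octave v second; the list `NAMES`, the dictionary `piano`, and the spelling of sharps with a trailing "s" are assumptions made for this illustration.

```python
# Reference R = C4, derived from A440 as in equation (3.5).
C4 = 440.0 * 2 ** (-1 + 3 / 12)

# Pitch classes from C to B, numbered 0 to 11 (sharps spelled with "s").
NAMES = ["C", "Cs", "D", "Ds", "E", "F", "Fs", "G", "Gs", "A", "As", "B"]

def f(k, v, R=C4):
    """Frequency of pitch class k in piano octave v, following (3.4) with reference R."""
    return R * 2 ** ((v - 4) + k / 12)

# The 88 piano keys run from A0 (k=9, v=0) to C8 (k=0, v=8).
piano = {}
for n in range(9, 9 + 88):
    v, k = divmod(n, 12)
    piano[f"{NAMES[k]}{v}"] = f(k, v)

print(len(piano), piano["A0"], piano["C4"], piano["A4"])
```

A dictionary keyed by name keeps the symbols and their frequencies together, so `piano["Cs4"]` plays the role of the symbol Cs4 in the text.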

3.2.2 Equal-Tempered Semitone as a Ratio

In discussing equation (3.1), we saw that in the equal-tempered scale the number 1.05946..., which
corresponds to $\sqrt[12]{2}$, is the factor by which the frequency of a tone must be raised in order to obtain
a frequency one semitone higher. Another way to say this is that the interval of a semitone is the ratio
1.05946:1. The advantage of this representation is that it is independent of any particular frequency.
When any frequency is multiplied by the factor 1.05946..., the next semitone in sequence is
automatically produced. For example, if A = 440 Hz, then A♯ = 440 · 1.0595..., and so on.

3.2.3 Nonstandard Reference Frequencies

Using the equal-tempered semitone as a ratio allows for construction of scales on nonstandard
reference frequencies as well. For example, we can find a semitone above 450 Hz by multiplying
450 · 1.0595. This can be used to construct equal-tempered scales for antique and nonstandard
instruments that used this reference frequency.
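The repeated multiplication described here can be sketched as a short loop. The function name `chromatic_scale` is an assumption for illustration; the 450 Hz reference is the example from the text.

```python
SEMITONE = 2 ** (1 / 12)  # the ratio 1.05946...:1

def chromatic_scale(f_ref, steps=12):
    """Build an equal-tempered scale by multiplying by the semitone ratio."""
    freqs = [f_ref]
    for _ in range(steps):
        freqs.append(freqs[-1] * SEMITONE)
    return freqs

# Equal-tempered scale on the nonstandard reference 450 Hz.
for freq in chromatic_scale(450.0):
    print(round(freq, 3))
```

After twelve multiplications the result lands on the octave, twice the reference, up to floating-point rounding.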

The use of A440 as a standard pitch is a comparatively recent development. Agreement is still
so fragile among musicians that in 1986 the Piano Technicians Guild, an international nonprofit
organization of more than 3500 piano technicians, felt compelled to adopt a resolution calling for
continued worldwide acceptance of A440 as the standard pitch. The Guild summarized the
situation as follows:

The history of musical pitch over the last three centuries has been one of confusion and misunderstanding.
The pitch of A has ranged from 312 hertz used in a seventeenth-century church organ to a high of 464 used
by some British military bands at the end of the nineteenth century.

As early as 1834, a congress in Stuttgart, Germany, unsuccessfully attempted to standardize pitch at A-440.
In the early years of this century, a number of groups in the United States formally adopted A-440 as a standard
pitch.

The United States Bureau of Weights and Measures adopted A-440 in 1920, and it was adopted as the
worldwide standard in a treaty signed during an International Standards Association meeting in London in 1939.

Nonetheless, instrumentalists and orchestras continue to demand alternative pitch references,
either to perform antique music or to satisfy the vanity of a particular virtuoso.





3.3 Just Intervals and Scales

Just intervals are intervals made from the ratio of small whole numbers. The only interval that
is just in the equal-tempered scale is the octave, 2/1. But the just scales are based entirely
on such small whole-number ratios. While the creation of scales from small integer ratios is a
very ancient practice,² the equal-tempered scale emerged from the just scales only in recent
centuries.

3.3.1 Origins of the Just Intervals

Ordinarily, when we hear a musical instrument, our ears fuse its many harmonics into a single
percept that we identify with the source of the sound. However, if we treat the harmonics not as
elements of a composite tone but as simple individual tones, we can view the harmonic series as
a set of intervals. Figure 3.1 shows a harmonic spectrum containing a fundamental at frequency f
and five overtones at integer multiples. The intervals between adjacent harmonics are simply the
ratios of their frequencies, as shown in the figure.

I think it's amazing that the most important musical intervals are embodied in just the first six
components of the harmonic series. The octave, fifth, and fourth are perfect intervals, and the
major and minor thirds are imperfect intervals (see section 3.8.2).

3.3.2 Adding and Subtracting Intervals

We can use equation (2.2), $f_x = f_R \cdot 2^x$, to add and subtract intervals. If $x = 2$ in that equation, then frequency $f_x$
will be two octaves above frequency $f_R$. By the law of exponents, we can rewrite this as

$f_x = f_R \cdot 2^1 \cdot 2^1.$


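Figure 3.1

Intervals of the harmonic series. Harmonics at frequencies f, 2f, 3f, 4f, 5f, and 6f; the ratios between adjacent harmonics are 2/1 (octave), 3/2 (fifth), 4/3 (fourth), 5/4 (major third), and 6/5 (minor third). The first three are marked as perfect intervals and the last two as imperfect intervals.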





So to add two octaves to $f_R$, we multiply it by the octave ratio 2/1 twice. This suggests that

Intervals are added by multiplying their ratios.

Let's test it. The sum of a fifth plus a fourth should be an octave. If we multiply the ratios of the
fifth and fourth:

$\frac{3}{2} \cdot \frac{4}{3} = \frac{2}{1},$

the result is indeed an octave.

If multiplying ratios corresponds to adding intervals, then dividing ratios should correspond to
subtracting them. From the example we'd expect that subtracting a fifth from an octave should
yield a fourth, and indeed

$\frac{2}{1} \div \frac{3}{2} = \frac{4}{3}.$

So it follows that

Intervals are subtracted by dividing their ratios.

These rules are a consequence of the exponential relationship between pitch and frequency.
Subtracting an interval from an octave produces its inversion. Thus, in the previous example,
the fifth and the fourth intervals are each other's inversions.

We can add or subtract an interval to or from itself $n$ times simply by raising its ratio to the power
of $n$ (where $n$ is an integer). For example, $(2/1)^2 = 4$ ascends two octaves, and $(1/2)^2 = 1/4$ descends
two octaves. Similarly, $(3/2)^n$ ascends by $n$ fifths, and $(2/3)^n$ descends by the same amount.

If we add or subtract an interval to every pitch in a score, we transpose that score. For example,
to raise a melody by a fifth, multiply the ratios of all its pitches by 3/2. To lower it by a fifth,
multiply the ratios of all its pitches by $2/3 = (3/2)^{-1}$.
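These rules are easy to check with exact rational arithmetic. The sketch below uses Python's standard `fractions` module; the helper names `add`, `subtract`, `invert`, and `transpose` are assumptions introduced for illustration.

```python
from fractions import Fraction

OCTAVE = Fraction(2, 1)
FIFTH = Fraction(3, 2)
FOURTH = Fraction(4, 3)

def add(a, b):
    """Add two intervals by multiplying their ratios."""
    return a * b

def subtract(a, b):
    """Subtract interval b from interval a by dividing their ratios."""
    return a / b

def invert(interval):
    """Inversion: subtract the interval from an octave."""
    return subtract(OCTAVE, interval)

def transpose(ratios, interval):
    """Transpose a list of pitch ratios by one interval."""
    return [r * interval for r in ratios]

print(add(FIFTH, FOURTH))        # a fifth plus a fourth
print(subtract(OCTAVE, FIFTH))   # an octave minus a fifth
print(FIFTH ** 2, FIFTH ** -1)   # two fifths up, one fifth down
```

Using `Fraction` rather than floating point keeps every result as an exact ratio of integers, which is the form the text works with.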

3.3.3 Just Pentatonic Scale

The simplest just scale, one that seems to exist in every human culture, is the just pentatonic
scale. It is very consonant because it has no minor second. We can get a reasonably good idea of
what this scale sounds like by playing only the black keys of a piano. However, the original just
pentatonic scales were based on ratios of small integers, not on the homogenized divisions of the
octave given by the equal-tempered scale as used in pianos.

The just pentatonic scale can be constructed entirely from the interval of the fifth (3/2). However, there
is a more intuitive way of constructing this scale, involving the fifth and its inversion, the fourth (4/3):

1. Start with some pitch, such as C.

2. Multiply the frequency of C by 4/3 to find the frequency for F.

3. Multiply C by 3/2 to find the frequency for G.





Pitch    C      F      G
Ratio    1/1    4/3    3/2

Figure 3.2

Pentatonic scale, first step.


Pitch    C      D      F      G      A        (C)
Ratio    1/1    9/8    4/3    3/2    27/16    2/1

Figure 3.3

Just pentatonic scale.

So far we have three pitches, C, F, and G (figure 3.2). We create the remaining two pitches of the
scale, D and A, from the ones we have so far.

4. To get D, go down a fourth from G. If the upward-going fourth is 4/3, the downward-going
fourth is 3/4. Expressed in ratios, D = (3/2) · (3/4) · C, which simplifies to D = 9/8 · C.

5. To get A, go up a fifth from D: A = (9/8) · (3/2) · C, which equals (27/16) · C. Notice that
the interval (27/16) · C is a major sixth up from C.

The full pentatonic scale is shown in figure 3.3 with the octave added.
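The five construction steps above can be followed literally with exact fractions. This is a sketch only; the variable names and the choice of C = 1/1 as the starting ratio are assumptions for illustration.

```python
from fractions import Fraction

FIFTH = Fraction(3, 2)
FOURTH = Fraction(4, 3)

C = Fraction(1, 1)   # step 1: start with some pitch
F = C * FOURTH       # step 2: up a fourth from C
G = C * FIFTH        # step 3: up a fifth from C
D = G / FOURTH       # step 4: down a fourth from G
A = D * FIFTH        # step 5: up a fifth from D

# Arrange in ascending order and add the octave.
pentatonic = sorted([C, D, F, G, A]) + [2 * C]
print([str(r) for r in pentatonic])
```

Multiplying each ratio by a frequency for C, such as a chosen reference in hertz, turns the ratios into playable frequencies.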


3.4 The Cent Scale 


The cent scale is a simple means for comparing the size of intervals.³ Where the equal-tempered
chromatic scale divides the octave into 12 degrees, the cent scale divides the octave into 1200
degrees, supplying 100 times the pitch resolution of the equal-tempered chromatic scale. Recalling
the definition of the semitone given in equation (3.2), we can define the interval of 1 cent as

$2^{1/1200} = 1.0005778.$    Cent (3.7)

As a consequence, one semitone is exactly 100 cents. The pitch distance between adjacent cent
intervals is not noticeable to the ear (see sections 6.4.3 and 6.4.5). So, the cent scale serves as a
pragmatic way to compare any musical intervals regardless of how the intervals are derived.

If $r$ is an interval, then the cent size $c$ of that interval is

$c = 1200 \cdot \frac{\log_{10} r}{\log_{10} 2},$    Cent Interval (3.8)

where $\log_{10} x$ is the logarithm base 10 of $x$ (see appendix A). For example, consider $r = 2/1$, the
octave. Then we have

$c = 1200 \cdot \frac{\log_{10} 2}{\log_{10} 2} = 1200.$

Let's use this to compare the ratios of the just fifth, 3/2, and the tempered fifth, $2^{7/12}$. The tempered
fifth is exactly 700 cents. By (3.8) the just fifth is 701.955 cents. So the tempered fifth is almost
2 cents flat of a perfect fifth.

To go in the other direction, from an interval in cents to a ratio, use

$r = 10^{c / (1200 / \log_{10} 2)}.$    Inverse Cent (3.9)

Trivially, if $c = 1200$, $r = 2/1$, the octave.
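Equations (3.8) and (3.9) can be sketched as a pair of functions. The names `cents` and `ratio_from_cents` are assumptions for illustration; the base-10 logarithm is used to match the text, though any base gives the same result.

```python
import math

def cents(r):
    """Size in cents of the interval with frequency ratio r, following (3.8)."""
    return 1200 * math.log10(r) / math.log10(2)

def ratio_from_cents(c):
    """Frequency ratio of an interval of c cents, following (3.9)."""
    return 10 ** (c / (1200 / math.log10(2)))

print(round(cents(3 / 2), 3))          # just fifth
print(round(cents(2 ** (7 / 12)), 3))  # tempered fifth
print(ratio_from_cents(1200))          # the octave
```

The two functions are inverses of each other, so converting a ratio to cents and back returns the original ratio up to rounding.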


3.5 A Taxonomy of Scales

In order to talk sensibly about all kinds of scales, let's define the dodecaphonic scale as any scale
with 12 degrees. Dodeca is Greek for "twelve." Then the equal-tempered scale, also known as the
chromatic scale, is just a kind of dodecaphonic scale.

Similarly, let the heptatonic scale be any scale with seven degrees. By this definition, the scale
made by the white keys of the piano is the equal-tempered heptatonic scale. The diatonic scale
(see section 2.4.2) is a heptatonic scale with a particular order of scale degrees.⁴ Similarly, the
pentatonic scale is any scale with five degrees, and the black notes on the piano are
the equal-tempered pentatonic scale. Any pentatonic scale built on just ratios is an instance of the
just pentatonic scale.

With these definitions in place, a simple taxonomy of scales can be based on the number of
degrees and whether the scale system is tempered or just (table 3.2).


Table 3.2
Simple Taxonomy of Scales

                 Intonation
Degrees          Just                  Equal-Tempered
Pentatonic       Just pentatonic       Equal-tempered pentatonic
Heptatonic       Just heptatonic       Equal-tempered heptatonic
Dodecaphonic     Just dodecaphonic     Equal-tempered dodecaphonic





3.6 Do Scales Come from Timbre or Proportion?

In section 3.3.1, we saw how the perfect intervals and the major and minor thirds are all present
in the first six partials. This is such a striking coincidence that it has led some to wonder if perhaps
the goal of the early music engineers might have been to fashion scales from these ratios. I call this
the deductive scale conjecture: that scales were deduced from the nature of the harmonics. This
conjecture is disputed by some. In his book Genesis of a Music (1947, 87), Harry Partch states,
"Long experience... convinces me that it is preferable to ignore partials as a source of musical
materials. The ear is not impressed by partials as such. The faculty, the prime faculty, of the ear
is the perception of small-number intervals, 2/1, 3/2, 4/3, etc. and the ear cares not a whit whether
these intervals are in or out of the overtone series."

The earliest known research in the West on musical scales was conducted by Pythagoras
(ca. 580-500 B.C.E.) and his followers. We know that the Pythagoreans viewed music as a branch
of science and believed that the construction of musical scales should proceed out of an analogical
process that related, for example, the periodic movements of a string to the periodic movements
of the planets. They weighed the distances between planets the same way they weighed the
divisions of a musical string, namely by the study of ratio and proportion. Figure 3.4 shows an
interpretation by Robert Fludd (a contemporary of Johannes Kepler) of the relation between the
harmony of the spheres and the proportional divisions of a string.⁵



Figure 3.4 

The cosmic monochord of Robert Fludd. 





From the Pythagorean perspective (shared by Partch), the important thing about a musical scale
is its proportionality (how it divides up the unity of a string), not the relationship between that
proportionality and any physical artifact such as the overtone series.

From this evidence, one might argue that scales developed out of the mathematics of proportion.
I call this the inductive scale conjecture: that scales are a free creation of the human mind, based
on ratio and proportion. According to this conjecture, the scales are Platonic archetypes, and
physical musical instruments are imperfect instances of these archetypes that are manifested in the
world by way of human creativity.

Of course, these are only conjectures. The truth of how the scales actually developed is lost in the mists
of time. Are the scales derivative of the overtone series or derivative of mental constructions of
proportionality? Is the prime faculty of the ear the perception of small-number intervals or the perception of
harmonics? I argue it both ways in this chapter because there is plenty of evidence for both perspectives.

It is evident that musical scales are free creations of the human mind because they do not occur
in nature. It is at least a striking coincidence that they align in their principal dimensions with the
harmonic sequence. Perhaps it was the very numinosity of this coincidence that compelled the
Pythagoreans to study this subject in the first place.

3.7 Harmonic Proportion

Pythagoras is credited by ancient Greek writers with having discovered the intervals of the octave,
fifth, fourth, and double octave (4/1). Pythagoras and his followers attached great numerological
significance to the fact that these most harmonious intervals were constructed strictly from ratios
of the consecutive integers 1, 2, 3, and 4. They were also impressed by the fact that these intervals
formed a sequence of superparticular ratios, that is, ratios of the form $(n + 1)/n$: 2/1 (octave),
3/2 (fifth), and 4/3 (fourth). They found mystical significance in the fact that by their nature
superparticular ratios pair an even and an odd number. They also noted that small integer superparticular
ratios seemed to be the most harmonious. These observations became permanent fixtures in the
minds of music theorists for the next two thousand years.

The means Pythagoras used to construct his scale can be stated as follows. He started with a
division of the string into 12 equal parts.

1. The octave is the ratio 12:6.

2. The fifth is found by taking the arithmetic mean of the octave, defined as $x = (a + b)/2$. Thus,
(12 + 6)/2 = 9, and the ratio 9:6 = 3:2 is the fifth.

3. The fourth is found via the harmonic mean, defined as $x = 2ab/(a + b)$. Thus, (2 · 12 · 6)/
(12 + 6) = 8, and the ratio 8:6 = 4:3 is the fourth.

Pythagoras combined these results into what he called the harmonic proportion,

12:9 :: 8:6,    (3.10)

which he took to be the foundation of all music.
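The two means and the resulting proportion can be verified with exact fractions. This is a sketch for checking the arithmetic; the function names `arithmetic_mean` and `harmonic_mean` are assumptions for illustration.

```python
from fractions import Fraction

def arithmetic_mean(a, b):
    """x = (a + b) / 2"""
    return Fraction(a + b, 2)

def harmonic_mean(a, b):
    """x = 2ab / (a + b)"""
    return Fraction(2 * a * b, a + b)

a, b = 12, 6                       # the octave 12:6
am = arithmetic_mean(a, b)         # divides the octave at 9
hm = harmonic_mean(a, b)           # divides the octave at 8

print(am, hm)
print(Fraction(a, am), Fraction(hm, b))   # both sides of 12:9 :: 8:6
print(am / b, Fraction(a, hm))            # 9:6 and 12:8
```

Both sides of the harmonic proportion reduce to the fourth, while the remaining pairs 9:6 and 12:8 reduce to the fifth.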





3.8 Pythagorean Diatonic Scale 

The scale that eventually came to be associated with Pythagoras adds two more pitches, E and B,
to the just pentatonic scale to produce the Pythagorean diatonic scale. Although it can be built
entirely from fifths, using its inversion, the fourth, helps keep its construction simple.

1. Construct a pentatonic scale.

2. Add pitch E by going down a fourth from A:

$E = \frac{27}{16} \cdot \frac{3}{4} = \frac{81}{64}.$

3. Add pitch B by going up a fifth from E:

$B = \frac{81}{64} \cdot \frac{3}{2} = \frac{243}{128}.$

The Pythagorean scale is shown in figure 3.5.

We can create a set of functions to produce the frequencies of the Pythagorean diatonic scale just
as we did for the equal-tempered scale (see section 3.2.1). As before, we need a reference
frequency, a reference octave, and the intervals.

1. Start from A440. The reference frequency R = 440 Hz.

2. Build the scale so that when $v = 4$ frequencies are in the fourth piano octave. We want to create
a function that takes the octave $v$ as its argument and gives Pythagorean C in any octave. How do
we go from A440 to Pythagorean C? The answer is in figure 3.5. We subtract the interval of a major
sixth, the distance from A down to C, by multiplying A by 16/27:

$C_\pi(v) = R \cdot \frac{16}{27} \cdot 2^{v-4}.$    (3.11)

Because we are using integer ratios, we end up with a different frequency for middle C (260.741 Hz)
than in the equal-tempered scale. I introduce the notation $\pi$ to distinguish the "πthagorean" C
from the equal-tempered C. Pythagorean middle C is $C_\pi(4)$.

Pitch    C      D      E        F      G      A        B          (C)
Ratio    1/1    9/8    81/64    4/3    3/2    27/16    243/128    2/1

Figure 3.5

Pythagorean scale.





3. Finally, create interval frequency functions:

$F_\pi(v) = C_\pi(v) \cdot \frac{4}{3},$

$G_\pi(v) = C_\pi(v) \cdot \frac{3}{2},$

$D_\pi(v) = C_\pi(v) \cdot \frac{9}{8},$

where $v$ is the desired octave.
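A minimal sketch of these functions follows, assuming the ratios of figure 3.5 and the reference R = 440 Hz. The Python names `c_pi`, `d_pi`, `f_pi`, and `g_pi` stand in for the subscripted notation and are assumptions for illustration.

```python
R = 440.0  # reference frequency, A440

def c_pi(v):
    """Pythagorean C in piano octave v, following equation (3.11)."""
    return R * 16 / 27 * 2 ** (v - 4)

def d_pi(v):
    """Pythagorean D: a 9/8 whole step above C."""
    return c_pi(v) * 9 / 8

def f_pi(v):
    """Pythagorean F: a 4/3 fourth above C."""
    return c_pi(v) * 4 / 3

def g_pi(v):
    """Pythagorean G: a 3/2 fifth above C."""
    return c_pi(v) * 3 / 2

print(round(c_pi(4), 3), round(d_pi(4), 3), round(f_pi(4), 3), round(g_pi(4), 3))
```

Functions for the remaining degrees follow the same pattern with the ratios 81/64, 27/16, and 243/128; multiplying `c_pi(4)` by 27/16 returns the A440 reference.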

Hearing early music played with just intervals can sound transcendentally beautiful, especially
if the intervals are played accurately. Music in the Middle Ages was mostly written using the
Pythagorean scale, and the just ratios seem to lend this music a refreshing, crisp air.

But there are two significant problems with the Pythagorean scale that musicians have
historically disliked: some of its intervals are not musically pleasing because they do not align with the
harmonic series, and it is awkward to transpose.

3.8.1 Intervals of the Pythagorean Diatonic Scale

Figure 3.6 shows the Pythagorean scale with intervals between the pitches. The top row shows the
intervals built up from $C_\pi$. The bottom row shows the sizes of the intervals, that is, the difference
between adjacent intervals. Recall that intervals are subtracted by dividing their ratios. For
example, the interval of the whole step C:D is

$\frac{9/8}{1/1} = \frac{9}{8}.$

The whole step D:E is

$\frac{81/64}{9/8} = \frac{9}{8}.$

The half step E:F is

$\frac{4/3}{81/64} = \frac{256}{243}.$

The rest of the intervals follow this pattern.
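The remaining step sizes can be worked out by dividing each ratio by its lower neighbor. The sketch below does this for the whole scale with exact fractions; the list name `PYTHAGOREAN` and the variable `steps` are assumptions for illustration.

```python
from fractions import Fraction

# Pythagorean diatonic scale from C up to the octave (figure 3.5).
PYTHAGOREAN = [
    Fraction(1, 1), Fraction(9, 8), Fraction(81, 64), Fraction(4, 3),
    Fraction(3, 2), Fraction(27, 16), Fraction(243, 128), Fraction(2, 1),
]

# Subtract each interval from its upper neighbor by dividing the ratios.
steps = [upper / lower for lower, upper in zip(PYTHAGOREAN, PYTHAGOREAN[1:])]

for name, step in zip(["C:D", "D:E", "E:F", "F:G", "G:A", "A:B", "B:C"], steps):
    print(name, step)
```

Only two step sizes appear: the 9/8 whole step and the 256/243 half step.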

3.8.2 The Syntonic Comma

The interval of the third in the Pythagorean scale was considered a dissonance in the Middle Ages,
and as a result compositions would typically omit the third in the final chord of a composition so
as to end only with perfect intervals (fourths, fifths, and octaves), an effect that sounds hollow
to modern ears.





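Degree    1      2      3        4      5      6        7          8
Ratio     1/1    9/8    81/64    4/3    3/2    27/16    243/128    2/1

Sizes of the steps between adjacent degrees: 9/8, 9/8, 256/243, 9/8, 9/8, 9/8, 256/243.

Figure 3.6

Pythagorean scale with intervals.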


The reason the third was considered dissonant is that all the Pythagorean major thirds (C:E,
F:A, and G:B) use the 81/64 ratio, which is not the same as the 5/4 major third that occurs
naturally in the overtone series. The three Pythagorean major thirds are a little sharp of the 5/4 major
third; hence they don't line up perfectly with the overtones of harmonic instruments, causing a
roughness in the sound because of beats (see section 6.7). This imperfection in the otherwise
beautifully symmetrical edifice of the Pythagorean scale was irritating enough to be given a
name. The ratio of

$\frac{81}{64} \div \frac{5}{4} = \frac{81}{80}$

is the syntonic comma, also known as the comma of Didymus. It is the amount by which
the Pythagorean major thirds are out of tune with the 5/4 major third of the overtone series.
The Pythagorean major third is about 21.5 cents sharp, about a fifth of a semitone, which
is easily noticed. The same problem afflicts the Pythagorean minor third, the major and
minor sixths, and the major seventh and minor second. Only the perfect intervals are exactly
aligned with the overtone series. Perhaps this is where the nomenclature of "perfect/imperfect"
originated.
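The size of the syntonic comma can be checked by combining exact ratios with the cent formula of equation (3.8). The helper `cents` below is an assumption for illustration and uses a base-2 logarithm, which is equivalent to the base-10 form in the text.

```python
from fractions import Fraction
import math

def cents(r):
    """Interval size in cents; equivalent to equation (3.8)."""
    return 1200 * math.log2(float(r))

pythagorean_third = Fraction(81, 64)
just_third = Fraction(5, 4)

# Subtract the just third from the Pythagorean third by dividing ratios.
syntonic_comma = pythagorean_third / just_third

print(syntonic_comma, round(cents(syntonic_comma), 2))
print(round(cents(pythagorean_third), 2), round(cents(just_third), 2))
```

The same division applied to the three Pythagorean major thirds C:E, F:A, and G:B gives the same comma, since all three use the 81/64 ratio.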

3.9 The Problem of Transposing Just Scales

Suppose we have a song that was arranged for a high female voice, but we only have a low female
voice available. Unless, trivially, we could just drop the pitch of the song an entire octave to solve
the problem, it is necessary to transpose the music by some interval so that it lies within the
available vocalist's range. If all we have is the diatonic Pythagorean scale, we have only two less-than-ideal
work-arounds:

• Retune all accompanying instruments to a new reference frequency R.

• Transpose to a different key within the Pythagorean scale.

Retuning instruments is at least nontrivial, and for some instruments impossible, and is to be
avoided. So the only realistic alternative is transposition.





To achieve a transposable tuning system, one might naively think that all we must do is extend
the Pythagorean scale to 12 degrees using the method of adding and subtracting intervals. Then
one could transpose music to any chromatic degree as we do with the modern equal-tempered
scale. Let us test this idea by constructing the dodecaphonic Pythagorean scale.

3.9.1 Pythagorean Dodecaphonic Scale 

All the intervals in the Pythagorean dodecaphonic scale can be generated from the interval of the 
fifth (3/2) raised to integer powers. 

1. Beginning with $(3/2)^0 = 1$, labeled C, ascend and descend by six fifths in both directions. The
spelling of the scale degrees (whether they are sharp, flat, or natural) is determined by the direction
of interval movement. Since we start at C, we move up a fifth to G, and so forth. Eventually the
interval of a fifth above B is F♯. Similarly, going down by fifths from C, the fifths below F are B♭,
E♭, A♭, and so forth. Note that at the extremes we have a low G♭ at 64/729 and a high F♯ at 729/64.


Power           Ratio      Degree
$(3/2)^{-6}$    64/729     G♭
$(3/2)^{-5}$    32/243     D♭
$(3/2)^{-4}$    16/81      A♭
$(3/2)^{-3}$    8/27       E♭
$(3/2)^{-2}$    4/9        B♭
$(3/2)^{-1}$    2/3        F
$(3/2)^{0}$     1/1        C
$(3/2)^{1}$     3/2        G
$(3/2)^{2}$     9/4        D
$(3/2)^{3}$     27/8       A
$(3/2)^{4}$     81/16      E
$(3/2)^{5}$     243/32     B
$(3/2)^{6}$     729/64     F♯


2. Add or subtract octaves from these intervals until they lie within the compass of one octave 
(remembering that adding intervals is multiplying their ratios). 


Ratio      Octave adjustment    Result
64/729     · $2^4$              1024/729
32/243     · $2^3$              256/243
16/81      · $2^3$              128/81
8/27       · $2^2$              32/27
4/9        · $2^2$              16/9
2/3        · $2$                4/3
1/1                             1/1
3/2                             3/2
9/4        · $1/2$              9/8
27/8       · $1/2$              27/16
81/16      · $1/2^2$            81/64
243/32     · $1/2^2$            243/128
729/64     · $1/2^3$            729/512


3. Arrange the intervals in ascending order of magnitude, and add the unison and octave.

Observe in figure 3.7 that the dodecaphonic Pythagorean scale contains within it all the intervals of
the just pentatonic scale and the Pythagorean diatonic scale. This shows that the interval of the fifth
underlies all of these scales. This method of generating fifth-based just scales can be extended to any
number of degrees. Interestingly, the magnitude of the ratio for F♯ (1.42) makes it sharper than G♭ (1.40).

Note that there are actually 13 degrees in this scale as constructed, because we have two kinds
of tritone intervals that are slightly different (F♯ and G♭). In the equal-tempered scale, the augmented
fourth F♯ and diminished fifth G♭ are equal (see section 2.5), but in the Pythagorean dodecaphonic
scale they are not, and it is ambiguous which should serve as the tritone. On some historical
keyboard instruments, the black key between F and G was actually split in two, with F♯ on one side
and G♭ on the other, rather than throwing one of them out. More often, one or the other was simply
left out. But if F♯ is left out, the fifth between G♭ and B is not a just 3/2 fifth. It is called a wolf fifth
because the beating between the interval and the overtones makes it sound unpleasantly like
wolves howling. And if G♭ is left out, some of the thirds and sixths are not harmonious either.

Figure 3.7

Pythagorean chromatic scale. C 1/1, D♭ 256/243, D 9/8, E♭ 32/27, E 81/64, F 4/3, G♭ 1024/729, F♯ 729/512, G 3/2, A♭ 128/81, A 27/16, B♭ 16/9, B 243/128, C 2/1. The two tritones are G♭ and F♯.

Degree    Ratio       Step from previous degree
C         1/1
D♭        256/243     256/243
D         9/8         2187/2048
E♭        32/27       256/243
E         81/64       2187/2048
F         4/3         256/243
G♭        1024/729    256/243
F♯        729/512     531,441/524,288
G         3/2         256/243
A♭        128/81      256/243
A         27/16       2187/2048
B♭        16/9        256/243
B         243/128     2187/2048
C         2/1         256/243

Figure 3.8

Chromatic Pythagorean scale with intervals.

The tritone was called by medieval music theorists the diabolus in musica, "the devil in music,"
not just because of its dissonant sound but because of the ambiguity of its ratios and the enormous
numeric sizes of those ratios.

4. For the final step, determine the interval sizes by subtracting the lower interval from its upper 
neighbor (remembering that subtracting intervals is dividing their ratios). 

Notice in figure 3.8 that there are two semitone intervals, a smaller interval with ratio 256/243,
called the Pythagorean diatonic semitone, or limma, and a larger interval with ratio 2187/2048,
called the Pythagorean chromatic semitone, or apotome. The ratio of these two semitones is

$\frac{2187}{2048} \div \frac{256}{243} = \frac{531{,}441}{524{,}288},$

the Pythagorean comma. The difference between these two semitones is 23.46 cents, about a fifth
of a tempered semitone. Notice that this is also the ratio between G♭ and F♯, the two tritones.
Coincidentally, this is also the amount by which the interval of 12 fifths differs from seven octaves, as we
will subsequently see.

The intervals from F to F♯ and from G♭ to G both use the 2187/2048 semitone.
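The four construction steps and the resulting comma can be reproduced with exact fractions. This is a sketch under stated assumptions: the helper `reduce_to_octave`, the flat and sharp spellings "Gb" and "Fs", and the variable names are all introduced here for illustration.

```python
from fractions import Fraction
import math

FIFTH = Fraction(3, 2)
DEGREES = ["Gb", "Db", "Ab", "Eb", "Bb", "F", "C", "G", "D", "A", "E", "B", "Fs"]

def reduce_to_octave(r):
    """Step 2: add or subtract octaves until 1 <= r < 2."""
    while r >= 2:
        r /= 2
    while r < 1:
        r *= 2
    return r

# Step 1: powers of the fifth from -6 to +6, then step 2: fold into one octave.
scale = {name: reduce_to_octave(FIFTH ** n) for name, n in zip(DEGREES, range(-6, 7))}

# Step 3: ascending order, with the octave added at the top.
ordered = sorted(scale.items(), key=lambda item: item[1])
ratios = [r for _, r in ordered] + [Fraction(2)]

# Step 4: interval sizes between neighbors, found by dividing ratios.
steps = [upper / lower for lower, upper in zip(ratios, ratios[1:])]

limma = Fraction(256, 243)
apotome = Fraction(2187, 2048)
pythagorean_comma = apotome / limma

print([name for name, _ in ordered])
print(sorted(set(steps)))
print(pythagorean_comma, round(1200 * math.log2(float(pythagorean_comma)), 2))
```

The comma computed from the two semitones equals both the ratio between the two tritones and the amount by which 12 fifths exceed seven octaves, which the assertions in a quick check would confirm with `FIFTH ** 12 / Fraction(2) ** 7`.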

Studying figure 3.8, we see that if we could only get rid of that pesky Pythagorean comma and
somehow make G♭ = F♯, we would have a self-consistent circular scale system built out of just
ratios. Then it would be possible to transpose to any key and remain in tune. This possibility
underlies the entire motivation for the development of tempered tunings.

3.9.2 Impact of Polyphony on Just Scales

Besides bringing music into a more playable range, transposition has become a powerful
organizing principle in music over time. Throughout the last eight centuries, Western composers have
become increasingly enamored of polyphony, the art of sounding more than one melody line at the
same time. In the process, they have sorted out which combinations of pitches sound good together
and which don't, and figured out how to harmonize multiple musical lines and chords. Out of this
arose harmony theory, which is the art of arranging multiple concurrent musical lines to reinforce
a feeling of harmonic movement and arrival, suspension and resolution. Most classical music, and
virtually all popular music, still follows rules of harmony first set down centuries ago.

The effective key signature of a musical work can change through the introduction of accidentals
not in the original key signature. This is musical modulation. For example, a melody started in the
key of C major might modulate to the key of G major by introducing F♯, and then eventually
modulate back to C major by reintroducing F♮ (see section 2.5.5). Modulation became an important
organizing principle for music in the Baroque and later eras. Over time, composers sought to
modulate to remote keys with more sharps and flats. But the irregular interval sizes of the dodecaphonic
Pythagorean scale limited music from being freely transposable to arbitrary keys because playing
music in some keys sounded better than in others. As modulation became increasingly important
to composers, the need for freely transposable tuning systems became urgent. Theorists began
searching for solutions to the problems of the Pythagorean scale.

3.9.3 Natural Chromatic Scale

It has been well known to music theorists from antiquity that if left to their own devices, singers (and
other performers, if their instruments would allow) eschew the Pythagorean thirds and sixths where
possible and prefer intervals that align with the harmonic series to improve the sonority of the
performance. As early as the second century, the Greek scientist, mathematician, and geographer Claudius
Ptolemy proposed a just intonation system that would reflect what musicians actually played.⁶

Following Ptolemy's lead, let's find out just how far from the 5/4 major third the Pythagorean
major third actually is. The answer is

$$\frac{81}{64} \div \frac{5}{4} = \frac{81}{80},$$

the Syntonic comma. 7 Then how far is the Pythagorean minor third from the 6/5 minor third? The
answer is

$$\frac{6}{5} \div \frac{32}{27} = \frac{81}{80},$$

again the Syntonic comma. In fact, the out-of-tune Pythagorean major and minor sixths as well as
the too-sharp major seventh and too-flat minor second are all exactly one Syntonic comma away
from ratios of much smaller integers that are in the overtone series and have a more agreeable sound.

What if we subtracted a Syntonic comma from all the Pythagorean intervals that are too sharp
and added it to the ones that are too flat? This would rectify all the intonational difficulties of
the Pythagorean scale in one fell swoop. Mathematically, we'd substitute ratios of much smaller
integers, and musically we'd align the scale degrees with the harmonic series. Ptolemy called this
the Syntonic diatonic scale (table 3.3). The Pythagorean diatonic scale and the interval differences
between the two scales are shown in the table.
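The comma correction can be checked with exact rational arithmetic. The following is a minimal sketch, not Ptolemy's own procedure: the names `pythagorean`, `TOO_SHARP`, and `syntonic_diatonic` are illustrative, and the set of degrees to lower (E, A, B) is read off the Difference row of table 3.3.

```python
from fractions import Fraction

SYNTONIC_COMMA = Fraction(81, 80)

# Pythagorean diatonic scale on C, as listed in table 3.3.
pythagorean = {
    "C": Fraction(1, 1), "D": Fraction(9, 8), "E": Fraction(81, 64),
    "F": Fraction(4, 3), "G": Fraction(3, 2), "A": Fraction(27, 16),
    "B": Fraction(243, 128), "C'": Fraction(2, 1),
}

# Degrees that sit one comma sharp in the Pythagorean scale (assumed from table 3.3).
TOO_SHARP = {"E", "A", "B"}

def syntonic_diatonic(scale):
    """Lower each too-sharp degree by one syntonic comma; leave the rest alone."""
    return {
        name: ratio / SYNTONIC_COMMA if name in TOO_SHARP else ratio
        for name, ratio in scale.items()
    }

syntonic = syntonic_diatonic(pythagorean)
for name, ratio in syntonic.items():
    print(f"{name:2s} {pythagorean[name]!s:>8} -> {ratio!s:>5}")
```

Using `Fraction` rather than floating point keeps every ratio exact, so the corrected degrees can be compared directly with the small-integer ratios in the table.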

Ptolemy's practical concern in designing this scale was to make the intervals agree with musical
practice. But he also noted approvingly that the ratios of the scale are all superparticular ratios
(see section 3.7). Ptolemy combined the best of both worlds: a practical scale that also contains
more superparticular ratios than does the Pythagorean scale (Berkert 1972).

The chromatic version of this scale is shown in table 3.4, together with the dodecaphonic
Pythagorean scale. The third row shows the interval differences between them. I call this the
natural chromatic scale. It was championed by Bartolomé Ramos (1482). Figure 3.9 provides a
visualization of the differences.

Table 3.3
Ptolemy's Syntonic Diatonic Scale

                      C     D     E       F     G     A       B         (C)
Syntonic diatonic     1/1   9/8   5/4     4/3   3/2   5/3     15/8      2/1
Pythagorean diatonic  1/1   9/8   81/64   4/3   3/2   27/16   243/128   2/1
Difference            1/1   1/1   80/81   1/1   1/1   80/81   80/81     1/1

Table 3.4
The Natural Chromatic Scale

Semitone                  1    2        3    4      5      6    7            8    9       10     11    12       (13)
Note                      C    C♯       D    E♭     E      F    F♯           G    A♭      A      B♭    B        (C)
Natural chromatic         1/1  16/15    9/8  6/5    5/4    4/3  64/45        3/2  8/5     5/3    16/9  15/8     2/1
Pythagorean dodecaphonic  1/1  256/243  9/8  32/27  81/64  4/3  729/512      3/2  128/81  27/16  16/9  243/128  2/1
Difference                1/1  81/80    1/1  81/80  80/81  1/1  32768/32805  1/1  81/80   80/81  1/1   80/81    1/1

Figure 3.9
Pythagorean chromatic and natural chromatic scales compared. (Panels: Pythagorean chromatic
scale; natural chromatic scale.)

For various religious and political reasons, Ptolemy's proposal was ignored and even suppressed
during the next dozen centuries or so. Pope John XXII even issued a papal bull in 1324 that
banished from the church music using such lascivious intervals (see appendix A).

The natural chromatic scale sounds very consonant. But ultimately it fares no better than the
Pythagorean scale for modulation and transposition. Consider the fifth from D to A, which is

$$\frac{5}{3} \div \frac{9}{8} = \frac{40}{27},$$

about 21.5 cents flat of the 3/2 perfect fifth. A triad built on D certainly sets the wolf tones howling.
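The size of these discrepancies is easy to verify numerically. The sketch below assumes the usual definition of the cent (1200 cents per octave, see section 3.4); the helper name `cents` and the variable names are illustrative only.

```python
import math
from fractions import Fraction

def cents(ratio):
    """Size of a frequency ratio in cents (1200 cents per octave)."""
    return 1200 * math.log2(float(ratio))

# Pythagorean major third measured against the just 5/4 major third.
syntonic_comma = Fraction(81, 64) / Fraction(5, 4)

# The fifth from D (9/8) up to A (5/3) in the natural chromatic scale.
d_to_a = Fraction(5, 3) / Fraction(9, 8)

# How far that fifth falls short of a pure 3/2 fifth.
shortfall = cents(Fraction(3, 2)) - cents(d_to_a)

print(syntonic_comma, round(cents(syntonic_comma), 2))
print(d_to_a, round(cents(d_to_a), 2), round(shortfall, 2))
```

The shortfall of the D:A fifth is itself one Syntonic comma, since (3/2) divided by (40/27) is 81/80.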


3.10 Consonance of Intervals

I've said that the intervals signify such qualities as identity, equality, and individuality (see
section 2.3.3). Another important way we characterize the intervals is by how pleasing or
disagreeable their sound is to us. While some intervals are harmonious, others, such as the wolf
fifth, set our teeth on edge. Table 3.5 shows the just intervals ordered from most to least pleasant,
based on the conventions of Western music theory. The musical term for "pleasant" is consonant,
which comes from Latin consonare, "sounding well together." The intervals toward the top of
table 3.5 are consonant; the intervals toward the bottom are dissonant.

3.10.1 Foundations of Consonance

What is the basis for the effect of consonance or dissonance? Is it something inherent in the
intervals, or is it in our perception? If we believe consonance is in the intervals, we should
examine their mathematical properties. If we believe that consonance is in our perception, we
should examine how we hear the intervals. I take up the latter approach in chapter 6. Here let's
pursue two questions: Is there a mathematical basis for the ordering of intervals from consonant
to dissonant? Is there a mathematical basis for the categorization of the intervals into perfect,
imperfect, and dissonant?

Table 3.5
Just Intervals Ordered by Decreasing Consonance

     Name            Ratio    Sum              Prime Factor    Limit

Perfect intervals
 1   Unison          1/1      1 + 1 = 2        1               3-limit
 2   Octave          2/1      2 + 1 = 3        2               3-limit
 3   Fifth           3/2      3 + 2 = 5        3/2             3-limit
 4   Fourth          4/3      4 + 3 = 7        2^2/3           3-limit

Imperfect intervals
 5   Major sixth     5/3      5 + 3 = 8        5/3             5-limit
 6   Major third     5/4      5 + 4 = 9        5/2^2           5-limit
 7   Minor third     6/5      6 + 5 = 11       (2·3)/5         5-limit
 8   Minor sixth     8/5      8 + 5 = 13       2^3/5           5-limit

Dissonant intervals
 9   Major second    9/8      9 + 8 = 17       3^2/2^3         5-limit
10   Major seventh   15/8     15 + 8 = 23      (3·5)/2^3       5-limit
11   Minor seventh   16/9     16 + 9 = 25      2^4/3^2         5-limit
12   Minor second    16/15    16 + 15 = 31     2^4/(3·5)       5-limit
13   Tritone         64/45    64 + 45 = 109    2^6/(3^2·5)     5-limit

A successful metric of consonance must

■ Decrease monotonically in proportion to increasing dissonance 8

■ Self-evidently partition intervals into the relevant categories, such as perfect, imperfect, and dissonant

Can we discover or invent an analysis of the traditional interval order (table 3.5) that explains the
order and classification numerically?

Concurrence. Giovanni Battista Benedetti (1530-1590) was perhaps the first to relate pitch and
consonance to frequencies of vibration. In two letters he wrote around 1563 to composer Cipriano
de Rore, he related interval consonance to the frequency of wave coincidence between two tones.
He observed that an interval consists of a shorter wavelength (higher pitch) and a longer
wavelength (lower pitch), and argued that the wavelengths of more consonant intervals coincide
more often than do those of more dissonant intervals.

Let's call the time required for the waveforms of an interval to coincide its precession time.
For example, if one bicycle wheel requires two seconds to turn once around and another requires
three seconds, their frequencies form the interval of a fifth, 3/2, and the wheels precess against
each other (that is, the faster one overtakes the slower one) every 2 · 3 = 6 seconds (figure 3.10).
Benedetti's hypothesis is that consonance decreases as precession time increases. When the
intervals are ordered by this criterion, their sequence from consonant to dissonant is 2:1, 3:2, 4:3,
5:3, 5:4, 6:5, 7:5, 8:5, and so on (figure 3.11). Note that these ratios are not strictly superparticular
and that by Benedetti's metric, the unused interval 7:5 is more consonant than the minor sixth (8:5).

Figure 3.10
Precession of 2 against 3.
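Benedetti's ordering can be reproduced by sorting intervals on the product of their terms. This is a minimal sketch under one assumption: the precession time of a ratio n:d in lowest terms is taken to be n · d, as in the 2 · 3 = 6 example. The function name `precession_time` is illustrative.

```python
from fractions import Fraction

def precession_time(ratio):
    """Coincidence period of a ratio n:d in lowest terms, taken as n * d."""
    r = Fraction(ratio)
    return r.numerator * r.denominator

# The intervals named in the text, deliberately listed out of order.
intervals = [Fraction(8, 5), Fraction(5, 4), Fraction(2, 1), Fraction(7, 5),
             Fraction(4, 3), Fraction(6, 5), Fraction(3, 2), Fraction(5, 3)]

# Shorter precession time means more consonant under Benedetti's hypothesis.
ordered = sorted(intervals, key=precession_time)
for r in ordered:
    print(f"{r.numerator}:{r.denominator}  precession time {precession_time(r)}")
```

Because the metric is a single number per interval, it yields a continuous ranking and nothing that marks a boundary between categories.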

Benedetti's theory challenged two ancient dogmas. First, his theory suggested that consonance
and dissonance are relative, not categorical, terms. Second, his theory implied that superparticular
ratios were not somehow tonally superior to other ratios.

Benedetti's ideas were later developed by Isaac Beeckman (1588-1637) and by Marin
Mersenne (1588-1648) in Harmonie Universelle (1635). Benedetti's approach shows an orderly
progression from consonance to dissonance, so it passes our first criterion for consonance. But it
does not suggest a way to partition the intervals into perfect, imperfect, and dissonant; indeed, it
predicts that there is no such criterion.

Additive Dissonance Metric. The Sum column in table 3.5 shows the sums of the numerator and
denominator of the ratio of each interval appearing in the Ratio column. For instance, the ratio of
the fifth is 3/2, and 3 + 2 = 5.

This additive dissonance metric is monotonically related to dissonance. Figure 3.12 plots the
interval number ordered by dissonance (in the order given in the first column in table 3.5) from
unison to minor second on the x-axis against the sum of each numerator and denominator on the
y-axis. The curve takes a significant jump upward from the minor second (31) to the tritone (109),
so I indicated the tritone to the side rather than plotting it. The fitted curve in the background is
just an aid to help join the points. 9

Because this additive dissonance metric increases monotonically with increasing dissonance, it
meets the first criterion for a dissonance metric. However, because the curve is gradual (until it
gets to the tritone), it does not suggest how to partition the intervals into perfect, imperfect, and
dissonant, so it fails the second criterion.
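Both observations can be confirmed from the ratios in table 3.5. The sketch below is illustrative only; `TABLE_3_5`, `additive_dissonance`, `sums`, and `gaps` are names chosen for this example.

```python
from fractions import Fraction

# Just intervals in the traditional order of table 3.5.
TABLE_3_5 = [
    ("unison", Fraction(1, 1)), ("octave", Fraction(2, 1)),
    ("fifth", Fraction(3, 2)), ("fourth", Fraction(4, 3)),
    ("major sixth", Fraction(5, 3)), ("major third", Fraction(5, 4)),
    ("minor third", Fraction(6, 5)), ("minor sixth", Fraction(8, 5)),
    ("major second", Fraction(9, 8)), ("major seventh", Fraction(15, 8)),
    ("minor seventh", Fraction(16, 9)), ("minor second", Fraction(16, 15)),
    ("tritone", Fraction(64, 45)),
]

def additive_dissonance(ratio):
    """Numerator plus denominator of the ratio in lowest terms."""
    return ratio.numerator + ratio.denominator

sums = [additive_dissonance(r) for _, r in TABLE_3_5]

# First criterion: the metric rises strictly with the traditional ordering.
is_monotonic = all(a < b for a, b in zip(sums, sums[1:]))

# Second criterion: look for jumps that would mark category boundaries.
gaps = [b - a for a, b in zip(sums, sums[1:])]

print(sums)
print(is_monotonic)
print(gaps)
```

The gap between the last perfect interval (the fourth) and the first imperfect one (the major sixth) is no larger than its neighbors, which is why the metric offers no natural place to draw the category lines.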

Partitioning Dissonance Metric. Any whole number greater than 1 can be factored into a product
of primes raised to powers, for example, $8 = 2^3$, $47 = 47^1$, $48 = 2^4 \cdot 3^1$, $49 = 7^2$.

Prime numbers are whole numbers greater than 1 that are not divisible by any number other than
themselves and 1. (By convention, 1 itself is not considered to be prime.) For example, 2, 3, 5,
and 7 are primes, but 4, 6, 8, and 9 are not because at least one smaller prime divides them evenly.
Similarly, 47 is prime, but 48 and 49 are not.



Figure 3.11
Precession time for various intervals.

Figure 3.12
Additive dissonance metric. (Annotation on the plot: Tritone: 109.)


The Prime Factor column in table 3.5 shows each interval as a ratio of the products of prime
numbers raised to powers. For example, the major second ratio is 9/8, therefore the prime factor
of the major second is $3^2/2^3$. Notice that the more dissonant intervals tend to involve larger
primes and higher powers. The perfect intervals involve only the small prime numbers 2 and 3.
The more dissonant intervals involve 5 as well (with the exception of the major second and minor
seventh). None of the just intervals shown in table 3.5 use 7 or the higher primes.
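The prime factor of a ratio, and the largest prime it involves, can be computed by trial division. This is a small sketch for illustration; `prime_factors` and `prime_limit` are hypothetical helper names, and returning 1 as the "limit" of the unison is a convention chosen here, not one stated in the text.

```python
from fractions import Fraction

def prime_factors(n):
    """Return {prime: exponent} for a positive integer n, by trial division."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        # Whatever remains after trial division is itself prime.
        factors[n] = factors.get(n, 0) + 1
    return factors

def prime_limit(ratio):
    """Largest prime in the numerator or denominator (1 for the unison)."""
    r = Fraction(ratio)
    primes = set(prime_factors(r.numerator)) | set(prime_factors(r.denominator))
    return max(primes, default=1)

print(prime_factors(48), prime_factors(49), prime_factors(47))
print(prime_limit(Fraction(4, 3)), prime_limit(Fraction(9, 8)),
      prime_limit(Fraction(64, 45)), prime_limit(Fraction(7, 5)))
```

Applied to table 3.5, the perfect intervals all have a limit of at most 3, while the major second and minor seventh also stay at 3 despite being classed as dissonant, which is the exception noted above.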

In spite of its limitations, there seems to be some historical justification for this metric. The
perfect intervals (those built from primes 2 and 3 only) were the first ones favored by early scale
builders. Ratios of prime factor 5 began appearing around 400 b.c.e. The exclusion of primes
higher than 5 to build musical ratios is called the five-limit by the composer Harry Partch in his
book Genesis of a Music (1947). The five-limit has only been transcended in recent centuries.
Partch used an eleven-limit system of ratios in the construction of his scales. These days, if a scale
is said to be n-limit, this means that the highest prime factor of any interval in the scale is n.

Attempts to order and classify consonance using strictly numeric rules are fine as far as they go.
But while we generally agree as to the consonance of the perfect intervals, opinions vary widely
as to the relative consonance or dissonance of the others, and no one metric seems to sum it all up.

Consonance appears to be influenced, but not determined, by underlying psychophysical
principles we all share. It seems as well to be a matter of taste decided differently by each musical
culture and each age. The harmonies in the chorales of J. S. Bach, for example, do not strike the
modern ear as particularly dissonant; however, listeners of his age sometimes found them
shocking. A similar progression has occurred with the music of Mozart, Beethoven, Wagner,
Mahler, Debussy, Stravinsky, Schoenberg, among others. So where intervals are concerned, it
seems that familiarity breeds consonance.

Its highly contextual nature suggests that attempts to classify consonance without regard to the
fundamentals of auditory perception are doomed. So let's defer further judgment until chapter 6.

3.10.2 Natural Major Scale 

Ptolemy's idea of a natural musical scale, first revived by Ramos, was rediscovered again in the
early Renaissance and championed by medieval theorists, including Lodovico Fogliano in Musica
Theoretica (1529). Around that time, the famous Renaissance music theoretician Gioseffo Zarlino
(1517-1590), in Institutioni Armoniche (1558), used the same basic ideas to create a scale based on
the ratios 4:5:6, which form a just major triad. If we take 4/4 as the root of the triad, the major
third above is 5/4, and the fifth above is 6/4. This triad incorporates the major third (5/4), minor
third (6/5), and perfect fifth (6/4 = 3/2). While the Pythagorean scale was built from the integers
1 to 4, this scale uses integers 1 to 6. Zarlino called this set the numero senario and, like the
Pythagoreans, found a mystical significance in it and sought to establish it as the proper
foundation of harmony.

There are three major triads in the just diatonic scale: C:E:G, F:A:C, and G:B:D (figure 3.13).
In Zarlino's scale, the frequencies of these three triads are perfectly in agreement with the
harmonic overtone series. Notice the presence of the prime number 5 in the 4:5:6 ratio, making
this a five-limit scale.



Figure 3.13
Natural major scale. (Three 4:5:6 triads are marked on the notes C1 D1 E F G A B C2 D2.)

Figure 3.14
Natural major scale with interval sizes.

Scale ratio      1/1    9/8    5/4    4/3    3/2    5/3    15/8   2/1
Note             C      D      E      F      G      A      B      C
Interval ratio      9/8    10/9   16/15  9/8    10/9   9/8    16/15

(Annotation on the figure: two sizes of whole steps.)

Figure 3.15
Major triad. (Pitch ratios 1, 5/4, 3/2, 2/1; successive intervals M3 = 5/4, m3 = 6/5, P4 = 4/3.)


Figure 3.14 shows the natural major scale with intervals between the pitches in the bottom 
row. 

Although the natural major scale succeeds at making the thirds consonant with the harmonic
series, it does so at the expense of the whole steps, which now are uneven in size. Some whole
steps are 9/8, but others are 10/9. Whereas in the Pythagorean scale the major thirds were "too
big," here some of the whole steps are "too small."
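The construction from three 4:5:6 triads, and the resulting step sizes, can be sketched with exact fractions. The placement of the triad roots (F a 4/3 fourth and G a 3/2 fifth above C) is an assumption consistent with the scale ratios in figure 3.14, and the names `TRIAD`, `fold`, `triads`, and `steps` are illustrative.

```python
from fractions import Fraction

TRIAD = (Fraction(4, 4), Fraction(5, 4), Fraction(6, 4))   # the 4:5:6 major triad

def fold(ratio):
    """Transpose a ratio by octaves until it lies in [1, 2)."""
    while ratio >= 2:
        ratio /= 2
    while ratio < 1:
        ratio *= 2
    return ratio

# Triad roots relative to C (assumed: F a fourth above C, G a fifth above C).
triads = {"CEG": Fraction(1), "FAC": Fraction(4, 3), "GBD": Fraction(3, 2)}

scale = {}
for notes, root in triads.items():
    for note, member in zip(notes, TRIAD):
        scale[note] = fold(root * member)

# Scale degrees C..B plus the octave, then the ratio between each adjacent pair.
ratios = [scale[n] for n in "CDEFGAB"] + [Fraction(2)]
steps = [hi / lo for lo, hi in zip(ratios, ratios[1:])]

print(" ".join(str(r) for r in ratios))
print(" ".join(str(s) for s in steps))
```

The step list contains both 9/8 and 10/9 whole steps, along with 16/15 half steps, matching the bottom row of figure 3.14.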

3.10.3 Natural Minor Scale


As we saw with the natural major scale, the ratios of the major triad are the ratios 4:5:6. The
major triad consists of a reference frequency R plus a major third up, R · (5/4), plus another minor
third up,

$$R \cdot \frac{5}{4} \cdot \frac{6}{5} = \frac{30}{20} R = \frac{3}{2} R.$$

Figure 3.15 shows the pitch ratios of a major triad plus the octave. Notice that the order of the
intervals is

$$\frac{5}{4},\ \frac{6}{5},\ \frac{4}{3},$$

that is, a major third, a minor third, and a perfect fourth.




Figure 3.16
Just minor scale. (Three 10:12:15 triads are marked on the scale from C1 to D2.)


We could create a just minor scale if we could reverse the order of the 5/4 and the 6/5 intervals,
creating a triad in the order minor third, major third, perfect fourth. Then we'd have something
like this:

$$1 \;\underset{\text{m3}}{:}\; ? \;\underset{\text{M3}}{:}\; ? \;\underset{\text{P4}}{:}\; 2$$

But what are the ratios of the pitches in this case? We're looking for something like the integer
ratio 4:5:6 but that produces a minor triad. Suppose we just stack up what we want the order to
be, like this:

$$1 : \frac{6}{5} : \left(\frac{6}{5} \cdot \frac{5}{4} = \frac{3}{2}\right) : \left(\frac{3}{2} \cdot \frac{4}{3} = 2\right)$$

This produces the right sequence of minor third, major third, and perfect fourth, but the ratios
don't come out as whole numbers:

$$1 : \frac{6}{5} : \frac{3}{2} : 2$$

expressed as decimal fractions is 1:1.2:1.5:2.

Since this is not a ratio of integers, it can't be the basis of a proper just scale. But we could
salvage this and make it into a ratio of integers just by multiplying all ratios by 10, like this:
10:12:15:20. With this ratio, we can properly form the just minor scale (figure 3.16).
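The multiplier 10 is the least common multiple of the denominators in the stacked ratios. The sketch below generalizes the step; `to_integer_ratio` and `stack` are hypothetical helper names introduced for this example.

```python
from fractions import Fraction
from functools import reduce
from math import gcd

def to_integer_ratio(ratios):
    """Scale a list of Fractions to the smallest equivalent whole numbers."""
    # Multiply through by the least common multiple of the denominators...
    lcm = reduce(lambda a, b: a * b // gcd(a, b),
                 (r.denominator for r in ratios), 1)
    ints = [int(r * lcm) for r in ratios]
    # ...then remove any factor the resulting integers still share.
    common = reduce(gcd, ints)
    return [i // common for i in ints]

def stack(intervals):
    """Pitch ratios obtained by stacking intervals upward from 1/1."""
    pitches = [Fraction(1)]
    for interval in intervals:
        pitches.append(pitches[-1] * interval)
    return pitches

minor = stack([Fraction(6, 5), Fraction(5, 4), Fraction(4, 3)])   # m3, M3, P4
major = stack([Fraction(5, 4), Fraction(6, 5), Fraction(4, 3)])   # M3, m3, P4

print(to_integer_ratio(minor))
print(to_integer_ratio(major))
```

The same function recovers 4:5:6 (with the octave, 4:5:6:8) for the major ordering, so the two triads differ only in which third comes first.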

3.10.4 Mean-Tone Tempered Scale

Another transitional attempt to create a transposable scale based on simple integer ratios was the
mean-tone tempered scale. It is a fascinating exercise in music engineering.

Temperament represented a radical departure from the just scales of the past. I've already used
the term to refer to the equal-tempered scale. In this context, tempering means the practice of
adjusting some of the degrees of the scale to "irrational" values so as to fit within an overarching
order that is still based on simple integer ratios. The meaning of temperament is the same for the
equal-tempered scale, but there the application is to all pitches of the scale uniformly.

Figure 3.17
Constructing the mean-tone scale, step 1.

But what does "irrational" mean? A rational number is a number that can be represented as a
ratio of two integers. The value of π is irrational because there is no ratio of integers that can
precisely represent it. Another example of an irrational number is $\sqrt{2}$.

Constructing a Mean-Tone Tempered Scale. The mean-tone tempered scale starts with the
same three natural major thirds that were used for the natural major scale. Five whole tones and
two semitones are derived from the thirds. The goal is to use only perfect 5/4 major thirds so as to
preserve consonance across transposition and modulation. The intended improvement over the
natural major scale is to do something about those pesky uneven whole steps by bending, or
tempering, them to fit. We can develop the mean-tone tempered scale in the following way:

1. As with the natural major scale, we want to have three pure 5/4 major thirds between C:E, F:A,
and G:B (figure 3.17). We still need to nail down the relation between D and its neighbors C and
E, and we must do the same for G and its neighbors F and A.

2. We tackle the major seconds between C:D:E, F:G:A, and G:A:B. Here's where the tempering
comes in. What if we simply cut the interval of the pure 5/4 major third in half to create two whole
steps, that is, if we took the mean value of a pure major third? (This is where the scale gets its
name.) What is its mean value? It wouldn't be 5/8, the arithmetic mean, because pitch is
exponential in frequency. To add intervals we must multiply their ratios, and we are looking for
one ratio that when multiplied by itself (that's the clue) adds up to a 5/4 major third. Such a ratio
would be a uniform division of the major third. What we are looking for is $\sqrt{5/4}$, the
geometric mean. This allows us to fill in the major seconds (figure 3.18).

3. We must figure out the interval size of the two minor seconds, E:F and B:C. Until we define
them, we have two disconnected islands of tonality, C:D:E and F:G:A:B. We must create
two equal-sized half steps that fill in the difference between the sum of the whole steps and
the octave. Fortunately, the minor seconds yield to the same logic that created the major
seconds.

There are two gaps in our scale that we want to fill with minor seconds. Let s be the (as yet
undefined) size of a minor second. We need two such minor seconds, or $s^2$, because when we
add intervals we multiply their ratios. We observe that there are five whole steps of size
$\sqrt{5/4}$. We want two semitones of size s plus five whole steps of size $\sqrt{5/4}$ to add up
to an octave of size 2/1. An informal equation for this might read, 2 semitones + 5 whole steps =
octave. That translates into the equation

$$s^2 \left(\sqrt{5/4}\right)^5 = s^2 \, (5/4)^{5/2} = 2.$$

Now we solve this for s, as follows.
Take the square root of both sides:

$$s \sqrt{(5/4)^{5/2}} = \sqrt{2}.$$

Isolate s:

$$s = \frac{\sqrt{2}}{\sqrt{(5/4)^{5/2}}} = \frac{\sqrt{2}}{(5/4)^{5/4}}. \tag{3.12}$$

The entire scale can now be constructed (figure 3.19).

Figure 3.19
Mean-tone tempered scale.

So, after all this mathematical heavy lifting, what does this scale sound like? Was it worth the
effort? Well, the improved uniformity does allow for greater transposition, but in the end (of
course) we still have problems: the fifth is no longer a simple 3/2. The half steps and whole steps
are not simple either, so we're really no closer to having a scale that can transpose and that also
lines up with musical instrument harmonic overtones.
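The whole construction can be checked numerically from the two step sizes derived above. This is a rough sketch in floating point, assuming the usual major-scale step pattern (whole, whole, half, whole, whole, whole, half); the names `WHOLE`, `SEMI`, and `cents` are illustrative.

```python
import math

def cents(ratio):
    """Size of a frequency ratio in cents (1200 cents per octave)."""
    return 1200 * math.log2(ratio)

WHOLE = math.sqrt(5 / 4)                     # tempered whole step, sqrt(5/4)
SEMI = math.sqrt(2) / (5 / 4) ** (5 / 4)     # tempered half step, equation (3.12)

# Step pattern of the major scale: W W H W W W H.
pattern = [WHOLE, WHOLE, SEMI, WHOLE, WHOLE, WHOLE, SEMI]

scale = [1.0]
for step in pattern:
    scale.append(scale[-1] * step)

for name, ratio in zip("CDEFGABC", scale):
    print(f"{name}  {ratio:.5f}  {cents(ratio):8.2f} cents")

fifth = scale[4]   # C up to G: three whole steps and one half step
print(f"fifth is {cents(3 / 2) - cents(fifth):.2f} cents flat of 3/2")
```

The major third C:E comes out at exactly 5/4 and the octave closes at 2, but the fifth works out to the fourth root of 5, a little over 5 cents flat of 3/2, which is the compromise the paragraph above describes.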

3.11 The Powers of the Fifth and the Octave Do Not Form a Closed System

If we step back to look at all these efforts over the centuries to build the perfect scale, it's as
though we were trying to build a bridge but couldn't ever find a design that was sufficiently
proportional. There's always a piece that doesn't fit. My impression of the mean-tone scale is that
it's like a carpentry project gone awry: the main boards are cut right, but the carpenter had to bend
the rest into place and forcefully nail them down or they would spring loose again.

The problem is, simple integer ratios don't line up the way we'd like. For instance, as we
transpose around the circle of fifths, we logically expect to come back to our starting key. That is,
starting on C, if we go up by fifths, we expect to return to C in a higher octave:

C, G, D, A, E, B, F♯, C♯, A♭, E♭, B♭, F, C.

But if we use the simple 3/2 ratio to go up by fifths, and use the 2/1 to go up by octaves, the two
series don't end up on the same frequency for C at the top. As we go through the 12 keys, we're
adding fifths, which means we multiply their ratios. Twelve fifths would be $(3/2)^{12} = 129.746$,
which is just a little over seven octaves. But seven octaves exactly would be $(2/1)^7 = 128$. So
they don't line up. Stated another way,

$$\frac{(3/2)^{12}}{(2/1)^{7}} \neq 1.$$

In fact, it can be proven that there are no integers m and n such that

$$\left(\frac{3}{2}\right)^{m} = \left(\frac{2}{1}\right)^{n} \tag{3.13}$$

apart from the trivial solution m = n = 0. Contrary to the wishes of scale builders and musicians
from antiquity to the present, the powers of the integer ratios 3/2 and 2/1 do not form a closed
system.
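The claim that (3.13) has only the trivial solution follows from a parity argument. The following is a short sketch of one way to see it, written in the same symbols m and n; it is offered as an illustration rather than as the proof the text has in mind.

```latex
\begin{align*}
\left(\tfrac{3}{2}\right)^{m} = 2^{n}
  \quad &\Longleftrightarrow \quad 3^{m} = 2^{\,m+n}. \\[4pt]
\text{If } m \ge 1:\quad
  & 3^{m} \text{ is an odd integer greater than } 1, \text{ whereas } 2^{\,m+n}
    \text{ is a proper fraction, } 1, \text{ or an even integer.} \\
\text{If } m \le -1:\quad
  & \text{invert both sides to get } 3^{-m} = 2^{-(m+n)}
    \text{ and apply the same argument.} \\
\text{If } m = 0:\quad
  & 2^{n} = 1 \text{ forces } n = 0.
\end{align*}
```

So no nonzero stack of pure fifths can ever land exactly on a whole number of octaves.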

If there is no exact solution to (3.13), then what about approximate solutions? How close to equal
can we get for any possible combination of m and n? The optimal solution appears to be m = 12,
n = 7. The interval corresponding to this choice of m and n is

$$\frac{(3/2)^{12}}{(2/1)^{7}} = 1.01364 = 23.46 \text{ cents.} \tag{3.14}$$

Recall from section 3.9.1 that the ratio in (3.14) is known from antiquity as the Pythagorean
comma. While the distance by which the interval of 12 fifths misses seven octaves is a mere
23.46 cents, in this case, a miss is as good as a mile.

It seems that simple integer ratios raised to arbitrary powers don't necessarily form a closed
system and that the particular case of interest, $(3/2)^m = (2/1)^n$, has no solution. The
significance of this is that making a closed cyclic scale system based on multiples of fifths and
octaves can't be done with simple integer ratios. A closed scale system is required in order to allow
music to be transposed to any key and still sound in tune, so a transposable scale based on small
integer ratios is impossible, and a tempered scale must be used if transposing is really that
important.

The less the intervals of a scale are tempered the better, because then the tempered intervals will
sound less dissonant against the harmonic overtone series. The Pythagorean comma suggests to
the tempered scale developer where best to close the cycle of fifths and octaves. If 12 fifths are
flattened to equal seven octaves, the overall distortion in the fifths will be only 23.46 cents. This
is the rationale for building the equal-tempered scale with 12 semitones.

Is there any other combination of m and n that comes closer to unity than the Pythagorean
comma? Suppose we evaluate

$$\frac{(3/2)^{m}}{(2/1)^{n}}$$

for values of m and n over some range, say, 0 to 100 each, looking for scale systems that come as
close or closer to unity than does the scale system for m = 12, n = 7. Some candidate entries are

m     n     Cents
12    7     23.46     Pythagorean comma
41    24    -19.85    All fifths would have to be stretched
53    31    3.62      Very close to unity
94    55    -16.23    All fifths would have to be stretched


A positive cents value indicates that the fifths are sharp by that amount, and a negative value
indicates they are flat. Perhaps the most interesting result is that 53 fifths are only 3.62 cents
sharp of 31 octaves. Both 31 and 53 have been used to build scales.
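A brute-force search of this kind takes only a few lines. The sketch below is one possible version, not the procedure used to produce the table: for each m it pairs the stack of fifths with the nearest whole number of octaves, and the names `closure_error_cents` and `candidates` are illustrative. Printed values should agree with the table to within rounding in the last digit.

```python
import math

FIFTH_IN_OCTAVES = math.log2(3 / 2)   # one 3/2 fifth, measured in octaves

def closure_error_cents(m):
    """Return (n, error): the octave count nearest to m fifths, and the gap in cents."""
    n = round(m * FIFTH_IN_OCTAVES)
    return n, 1200 * (m * FIFTH_IN_OCTAVES - n)

# The Pythagorean comma (m = 12, n = 7) is the benchmark to match or beat.
PYTHAGOREAN_COMMA = closure_error_cents(12)[1]

candidates = []
for m in range(1, 101):
    n, err = closure_error_cents(m)
    if abs(err) <= abs(PYTHAGOREAN_COMMA) + 1e-9:
        candidates.append((m, n, err))

for m, n, err in candidates:
    print(f"m = {m:3d}  n = {n:3d}  {err:+.2f} cents")
```

Restricting n to the nearest octave count loses nothing, because any other n misses by at least half an octave.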


3.12 Designing Useful Scales Requires Compromise

Given the limitations of the just tuning systems, we find ourselves at a fork in the road:

■ We can move toward our original goal of transposing while retaining the just ratios, but with
compromises.

■ We can abandon the goal and choose another.

Although we still want to engineer a scale that meets our needs, we now know it's just a design
problem, not a quest for a holy grail that doesn't exist.

Some common choices that have been made at this juncture include the following:

■ Extend the use of tempering (see section 3.13).

■ Add more degrees to the just scales, allowing musicians to use alternative ratios when
transposing (see section 3.14).

■ Avoid transposing and modulation (see Hindustani Scales in section 3.14.2).

3.13 Tempered Tuning Systems 

Tempering is a compromise that abandons some aims in order to achieve others. If we give up the
goal of just ratios, we'd still like to have a scale that

■ Is transposable to all 24 major and minor keys

■ Sounds close enough to the just diatonic scale

■ Has intervals reasonably close to their small-integer ratio prototypes

■ Has 12 half steps to the octave

■ Can be transposed around the circle of fifths

■ Has no strange differences between supposedly same-sized intervals

To implement this compromise, we use tempering to close the cycle of fifths and octaves. What if
we spread the Pythagorean comma across a number of intervals so that it would become
unnoticeable?

3.13.1 Origins of Tempering

The concept of a tempered scale arose in the fourth century b.c.e. with Aristoxenus of Tarentum,
one of Aristotle's students. Aristoxenus argued empirically that precise ratios should be less
important to music theory than what musicians actually use, and suggested that the octave be
divided on a subjective basis into an equal number of intervals. To the same effect, the great
mathematician Leonhard Euler (1766) wrote, "The sense of hearing is accustomed to identify with
a single ratio, all the ratios which are only slightly different from it, so that the difference between
them be almost imperceptible." What Euler is referring to is now called the just noticeable
difference (JND) of pitch (see chapter 6). Another perspective on Euler's insight is the power of
our minds through conditioning and learning to generalize a rule across similar instances (see
section 9.22).

Perhaps the first practical tempering system was proposed by Vincenzo Galilei, father of Galileo
and a one-time student of Zarlino. Like many, including the Pythagoreans, Zarlino believed that
certain proportions had a mystical significance that revealed the hand of God. Vincenzo Galilei,
true to his Renaissance culture, believed that all scales were free creations of the human mind and
hence could be anything that pleased their creators (V. Galilei 1581; Strunk 1998). He proposed
solving the conundrum of intonation by using the integer ratio 18/17 as an approximation of the
semitone. At 98.96 cents, this ratio provides a usable tempered tuning system that has been
employed by fretted instrument makers ever since (see section 3.15).
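The quality of the 18/17 approximation is easy to check. The sketch below is illustrative only (the names `cents`, `per_fret`, and `twelve_frets` are chosen for this example); it measures one 18/17 semitone and the span of twelve of them stacked end to end.

```python
import math

def cents(ratio):
    """Size of a frequency ratio in cents (1200 cents per octave)."""
    return 1200 * math.log2(ratio)

GALILEI_SEMITONE = 18 / 17

per_fret = cents(GALILEI_SEMITONE)
twelve_frets = 12 * per_fret          # adding intervals adds their cents

print(f"18/17 semitone: {per_fret:.2f} cents")
print(f"12 frets: {twelve_frets:.2f} cents, "
      f"{1200 - twelve_frets:.2f} cents short of a 2/1 octave")
```

Each step is about one cent narrower than an equal-tempered semitone, so twelve steps fall roughly an eighth of a semitone short of the octave.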

Between the Renaissance and the modern age, Western music theorists tried many ways to hide
the Pythagorean comma and yet salvage as many just intervals as possible (usually the fifths and
major thirds) while excluding the wolf tones by the use of tempering. There are an unlimited
number of possible temperings, but the available solutions tend to cluster around a few common
aims, depending upon what one wants to optimize:

■ Mean-tone. Optimize the thirds and fifths in selected keys, and never mind the rest.

■ Well-tempered. Make all keys usable, but make some more purely intoned than others.

■ Equal-tempered. Make all keys sound the same.

3.13.2 Well Tempering 

The term well tempered covers all tuning systems that temper at least some intervals or that have
reasonably equal-sized semitones.

Andreas W. Werckmeister (1645-1706) developed a number of tempered tunings, including
Werckmeister temperament III, which he developed in 1691. Roughly speaking, this scale leaves
the black notes in Pythagorean just intonation and tempers the white notes, resulting in
various-sized major and minor intervals and either true or nearly true fifths and fourths. Such
irregular tempering essentially scatters bits of the Pythagorean comma widely, though not evenly,
across the scale, allowing fairly graceful transposition and modulation to remote keys.

Other irregular temperaments of the time included

■ Kirnberger temperament III (1779), by Johann Philip Kirnberger (1721-1783); some fifths are
tempered, some are pure.

■ Vallotti temperament (1728), by Francesco Antonio Vallotti (1697-1780); the "front" six fifths
of the circle of fifths (F, C, G, D, A, E, B) are tempered by 1/6 of a Pythagorean comma, whereas
the fifths on the "back" side are tuned pure. 10

■ Young temperament II (1800), by Thomas Young (1773-1829); similar to Vallotti's but starting
on C rather than F.

3.13.3 Tonal Palette 

As a consequence of the uneven distribution of the Pythagorean comma in irregular temperaments,
each key was imbued with a unique tonal palette or coloration based on the placement of the
various-sized intervals in its scale. Far from being a problem, this aspect of irregular temperaments
was appreciated by composers and performers of those times as lending character to the different
keys. Modulating around the circle of fifths in irregular temperaments alters the tension in the
triads and dominant seventh chords in characteristic ways that they found musically useful.

In the literature on tuning systems, the arguments for and against the various tuning systems
sound as though they were referring to wine tasting. Werckmeister III is pure in the best keys
and excellent for organs because many fourths and fifths are in tune, but it is irregular and
quixotic in how it handles modulation, and uneven in key color. Vallotti is smooth and regular,
perhaps with too little key contrast. It is clear that the choice of tuning system was a matter
of taste.

A common misconception about J. S. Bach's famous Das Wohltemperierte Klavier 11 is that it
was written as a demonstration piece for equal-tempered tuning. Bach almost certainly did not use
equal temperament, which did not come into practical use until after he died. He undoubtedly used
a mean-tone or irregular temperament of some sort, possibly one of Werckmeister's or one of his
own devising. Which exact tuning he used is unknown, but it is certain that Bach used this
composition as a vehicle to systematically explore the tonal palettes of the keys of the temperament
he was using (Barbour 1947; Barnes 1979; Kellner 1979).

3.13.4 Equal Tempering

The attempt with irregular temperaments to include some pure ratios only hides the intonational
problems in remote keys. But as composers developed and extended functional harmonization and
modulation, eventually there were no "remote" keys left in which to hide the wolves. Why then
not try tempering every degree of the scale in the same amount? Perhaps that would spread out
the Pythagorean comma to the extent that it would become unnoticeable because the
"out-of-tune-ness" would be everywhere the same.

What if we shrank the interval of a fifth just a little so that 12 of them would equal seven
doublings of the starting pitch? Let's name the tempered fifth $T_5$. Then we would be looking for
a value of $T_5$ such that $(T_5)^{12} = 2^7$. Solving for $T_5$ gives $T_5 = 2^{7/12} = 1.498$,
which is pretty close to 3/2 = 1.5 (although the fifths are a little flat). To generate the 12 steps
of the scale, all we would have to do is form successive intervals of $T_5$, and after creating 12 of
them, we would be back to where we started, a few octaves higher.

While the equal-tempered scale takes the approach of tempering the fifth according to $2^{7/12}$,
another equally valid approach is to shrink the semitone according to $\sqrt[12]{2} = 1.0594631$, which
is reasonably close to the minor second, $16/15 = 1.0666667$. The two approaches are equivalent,
since the result either way is that the octave is divided into 12 equal intervals.
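The arithmetic in the two preceding paragraphs can be checked with a few lines of Python. This is only an illustrative sketch; the names T5 and SEMITONE are chosen here and are not part of the text.

```python
# Tempered fifth and tempered semitone as plain floating-point arithmetic.
# T5 and SEMITONE are illustrative names, not notation from the book.
T5 = 2 ** (7 / 12)          # tempered fifth: twelve of these span seven octaves
SEMITONE = 2 ** (1 / 12)    # tempered semitone: twelve of these span one octave

print(f"tempered fifth    = {T5:.6f}  (just fifth 3/2 = {3 / 2:.6f})")
print(f"tempered semitone = {SEMITONE:.7f} (just minor second 16/15 = {16 / 15:.7f})")

# Stacking twelve tempered fifths lands on 2**7, exactly seven octaves up.
twelve_fifths = T5 ** 12

# A tempered fifth is the same interval as seven tempered semitones,
# which is why the two approaches give the same scale.
seven_semitones = SEMITONE ** 7
print(twelve_fifths, seven_semitones)
```

Because $2^{7/12} = (2^{1/12})^7$, tempering the fifth and tempering the semitone are two descriptions of one and the same division of the octave.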

Curiously, this quintessentially Western scale appears to have been first invented in China. In
1596, Prince Chu Tsai-yu (or Zhu Zai-You) apparently calculated the degrees of the equal-tempered
chromatic scale without benefit of logarithms (Barbour 1953; Kuttner 1975; Yasser 1932). However,
it evidently did not catch on in China as it did in the West. The idea was apparently put forward
first in Europe by Simon Stevin (1548-1620).¹² The theory became widely known through the
work of Mersenne (1635). But equal temperament did not become generally established in practice
until 1800, first in Germany, later in England and France.

3.13.5 Interval Error of Equal-Tempered Tuning 

Astonishingly, the equal-tempered intervals are close enough to the natural major scale that most
Western composers and musicians from the 1800s to the present have been satisfied with the
equal-tempered chromatic scale, and a very large body of music has been composed using it.
Ironically, however, there is not a single small integer ratio left in the scale (apart from the
unison and octave). Thus, one of the principal aims of the early scale builders has been lost.
Clearly, the desire for transposability won out over justness of intonation in Western music after
the advent of tempered tunings.


Table 3.6

Comparison of Natural and Equal-Tempered Chromatic Intervals

Degree  Name             Error (cents)    Degree  Name             Error (cents)
1       Unison              0.0           7       Tritone            -9.7763
2       Minor second      -11.731         8       Perfect fifth      -1.955
3       Major second       -3.910         9       Minor sixth       -13.686
4       Minor third       -15.641         10      Major sixth        15.641
5       Major third        13.686         11      Minor seventh       3.910
6       Perfect fourth      1.955         12      Major seventh      11.730


Figure 3.20

Natural and equal-tempered chromatic intervals.


But just how badly out of tune is equal temperament? Table 3.6 shows the size of the error in
cents between each equal-tempered degree and its natural chromatic scale equivalent. The sign of
each value in the Error column shows the cents by which the equal-tempered scale is sharp
(positive) or flat (negative) with respect to its just equivalent. Note that the worst errors are
for the minor and major thirds and sixths (figure 3.20).

3.13.6 Goodness-of-Fit Metric

We can get a crude quantitative idea of how closely aligned these two scales are by adding the
magnitudes of the Error column in table 3.6. Doing so shows that the sum total by which all tempered
intervals miss their natural chromatic scale equivalents is 103.624 cents. Is 103.624 cents
accumulated error good or bad? Are these differences significant? That analysis is postponed until
section 3.14, so that more scales can be evaluated.

Meanwhile, we can adapt this goodness-of-fit measure to other scales discussed in this chapter
to show in quantitative terms how closely aligned they are with the intervals of the natural
chromatic scale. For scales with many more degrees than the chromatic scale, the method is to first
pick the degrees that are closest to their natural chromatic equivalents, then sum the magnitude of
the errors.
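As a rough check on table 3.6 and on the 103.624-cent total, the errors can be recomputed directly. The sketch below assumes the natural chromatic ratios listed in NATURAL_CHROMATIC; the tritone 64/45 and minor seventh 16/9 are inferred from the tabulated errors rather than stated in this section.

```python
import math

# Natural chromatic ratios. Assumption: the tritone (64/45) and minor
# seventh (16/9) are inferred from the errors tabulated in table 3.6.
NATURAL_CHROMATIC = [
    ("Unison", 1, 1), ("Minor second", 16, 15), ("Major second", 9, 8),
    ("Minor third", 6, 5), ("Major third", 5, 4), ("Perfect fourth", 4, 3),
    ("Tritone", 64, 45), ("Perfect fifth", 3, 2), ("Minor sixth", 8, 5),
    ("Major sixth", 5, 3), ("Minor seventh", 16, 9), ("Major seventh", 15, 8),
]

def cents(num, den):
    """Size of the ratio num/den in cents."""
    return 1200 * math.log2(num / den)

# Equal-tempered degree i lies exactly 100*i cents above the unison.
# Positive error: equal temperament is sharp; negative: it is flat.
errors = {
    name: 100 * i - cents(num, den)
    for i, (name, num, den) in enumerate(NATURAL_CHROMATIC)
}

# Goodness of fit: the sum of the error magnitudes.
accumulated_error = sum(abs(e) for e in errors.values())

for name, err in errors.items():
    print(f"{name:15s} {err:8.3f}")
print(f"accumulated error: {accumulated_error:.3f} cents")
```

Under these assumed ratios the total comes out at about 103.624 cents, matching the figure quoted above.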

3.13.7 The Grand Solution

The equal-tempered scale inherits nearly all the important components of the Pythagorean scale
and can also transpose. Now every key sounds as in tune (or out of tune) as every other key, just
as we wanted, but at the expense of the pure integer ratios, which have been virtually banished.
It is somewhat reminiscent of the modern practice where an oak grove is ripped out to build a
shopping center and then the shopping center is named Oak Grove. We are left with the impression of
the pure intervals but not with their reality. We get the advantage of the modern conveniences
(transposition) but at the expense of the reason we wanted it. Isn't it interesting that not even
music is immune to the inevitable downside of technological advance? The moral: nothing is free.

Other cultures have made other choices. For instance, classical Hindustani and Arabic music
is still firmly rooted in small integer ratio scales, and that music scintillates with a pleasurable
harmonicity that has touched a deep longing in the Western ear, as evidenced by their popularity
in the West in recent times. The symmetry between the overtones of their instruments and the
scales they play upon is deeply satisfying. On the other hand, don't expect an oud or a sitar to
transpose.

3.14 Microtonality

As described in the previous section, the compromise of tempered tunings is to give up the use of
small integer ratios except for the unison and octave. The compromises of microtonality are not
as neatly assessed because of the greater number of directions that can be taken.

One of the main thrusts of early Western microtonal tunings was to increase the number of scale
degrees on keyboards. The original aim was to supply alternative choices of intervals when
modulating or transposing so as to retain as much as possible the simple integer ratios of the just
scales. Such a scale system would then contain microtones, which are scale degrees that are smaller
than a semitone.

Once again, however, we confront basic design questions. For instance, are the microtones to
be organized as a set of tempered intervals or as a collection of small integer ratios? Of course,
there are exponents of both approaches, and I consider each in turn.

3.14.1 Tempered Microtonal Scales

What if we simply increased the number of equal divisions of the octave from 12 to a larger
number? As the number of equal divisions of the octave goes up, not only will there be more scale
degrees to choose from but there is also an increased likelihood that some of them will land closer
to the just intervals than their chromatic tempered cousins do. A trivial modification of equation
(3.3) allows us to create arbitrary tempered divisions of the octave:

$f_{k,v} = f_R \cdot 2^{v + k/N}$,    (3.15)

where N is the desired number of degrees per octave, k is the integer degree number, and v is the
octave. A few such tempered scale systems approximate the just intervals better than the chromatic
scale.
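Equation (3.15) translates almost symbol for symbol into code. The following is a minimal sketch; the function names are invented here, and the 440 Hz reference in the usage lines is only an example value for $f_R$.

```python
def tempered_frequency(f_ref, k, v, n=12):
    """Equation (3.15): frequency of degree k in octave v for an
    N-fold equal division of the octave, relative to reference f_ref."""
    return f_ref * 2 ** (v + k / n)

def step_size_cents(n):
    """Size in cents of one step when the octave is split into n equal parts."""
    return 1200 / n

# Example usage; 440 Hz is an arbitrary reference chosen for illustration.
for n in (12, 19, 24, 53):
    first_step = tempered_frequency(440.0, 1, 0, n)
    print(f"N = {n:2d}: step = {step_size_cents(n):6.2f} cents, "
          f"degree 1 above 440 Hz = {first_step:.3f} Hz")
```

Whatever the value of N, degree k = N in octave v coincides with degree 0 in octave v + 1, so every such scale still repeats at the octave.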

19-Tone Scale. Another relatively close encounter between the series of fifths and octaves occurs
at 19 fifths above 11 octaves, where the fifths exceed the octaves by 137.145 cents. When N in
(3.15) is set to 19, the size of the equitempered scale division is 63.16 cents.

Why 19? The 19-tone major and minor thirds and major and minor sixths are all closer than the
corresponding equal-tempered intervals. The minor third is quite pure. The major third is flat,
although closer than the equal-tempered major third (see figure 3.21).

To temper using this scale, the fifths must be flatted by a total of 137.145 cents, which is worse
than the tempering required for the chromatic scale. Since there are 19 fifths, each fifth is flat
by 7.218 cents, making the fifths far from perfect.

Applying the goodness-of-fit metric to the 19-tone scale results in 109.31 cents accumulated
error, not as good as the chromatic scale's 103.624 cents. In spite of the improved thirds and
sixths, this scale has, for good reasons, not been favored over chromatic equal temperament.

Quarter-Tone Scale. When N in (3.15) is set to 24, we arrive at the quarter-tone scale, and the
size of the equitempered interval is 50 cents, or exactly one-half of a chromatic tempered semitone.

While all microtonal scales can produce exotic-sounding harmonies, the quarter-tone scale
is special because it is a superset of the equal-tempered scale. Or, we can think of it as two
equal-tempered scales combined, tuned 50 cents apart. A common arrangement for quarter-tone
music is to tune two pianos 50 cents apart. Listen, for example, to Three Quarter-Tone Pieces by
Charles Ives, or the compositions of Alois Hába (1893-1973).

Depending upon how the additional resources are used by a composer, quarter tones can extend
the tonal palette of the equal-tempered scale so that it ranges from strictly harmonic (using either
of the equal-tempered scale subsets) to mixtures that are reminiscent of the irregular
temperaments, to highly dissonant when using all the quarter tones together. The composer is given
additional possibilities of harmonic tension.

As one might expect, the goodness-of-fit metric for the quarter-tone scale is the same as for the
equal-tempered scale, 103.624 cents.

53-Tone Scale. The next close encounter of the fifths and the octaves occurs at 53 fifths and
31 octaves. Here the cycle of fifths ends up merely 3.615 cents above the octave. Each tempered
fifth is therefore 3.615/53 = 0.068 cents flat. According to Helmholtz (1863), this scale was first
proposed in 1608 by Nicolaus Mercator (1620-1687) as a system for measuring scales.
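The "close encounters" between stacked just fifths and octaves quoted for the 19-tone and 53-tone scales (and, for comparison, the familiar 12-tone case) can be reproduced with a short calculation. This is a sketch only; the helper name fifths_vs_octaves is an invention for this example.

```python
import math

# Size of a just fifth (3/2) in cents.
JUST_FIFTH_CENTS = 1200 * math.log2(3 / 2)

def fifths_vs_octaves(n_fifths, n_octaves):
    """Return (excess, per_fifth): how far n_fifths stacked just fifths
    overshoot n_octaves octaves, in cents, and the amount by which each
    fifth must be flattened to close the circle."""
    excess = n_fifths * JUST_FIFTH_CENTS - n_octaves * 1200
    return excess, excess / n_fifths

for n_fifths, n_octaves in [(12, 7), (19, 11), (53, 31)]:
    excess, per_fifth = fifths_vs_octaves(n_fifths, n_octaves)
    print(f"{n_fifths} fifths vs {n_octaves} octaves: "
          f"excess {excess:8.3f} cents, {per_fifth:.3f} cents per fifth")
```

The 12-fifth row is the Pythagorean comma; the 53-fifth row shows why so little tempering is needed per fifth in the 53-tone scale.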

Even Partch (1947) is impressed with this scale. He says it gives "a degree of falsity that might
really be called - and for the first time I use the word without quotation marks - inconsequential."





Table 3.7

Comparison of Natural Chromatic Scale and Cent Scale

Natural                         Natural
Chromatic   Cent    Error       Chromatic   Cent    Error
1           1        0.0        7           611     -0.22
2           113     -0.27       8           703     -0.04
3           205     -0.09       9           815     -0.31
4           317     -0.36       10          885      0.36
5           387      0.31       11          997      0.09
6           499      0.04       12          1089     0.27


Figure 3.21

Tempered microtonal tunings compared to the natural chromatic scale. (Rows shown: natural,
12-tone, 19-tone, quarter-tone, 53-tone.)


His estimation agrees with the goodness-of-fit metric, which is 10.402 cents accumulated error for
the 53-tone scale, which far surpasses the chromatic scale's 103.624 cents.

The Cent Scale as the Ultimate Tempered Microtonal Tuning. The cent scale itself is the
logical reductio ad absurdum of this progression of tempered microtonal scales. With its 1200
degrees per octave, it can be thought of as the ultimate tempered microtonal scale. Why not simply
compose directly in cents? Table 3.7 shows which cent degrees correspond most closely to the
natural chromatic scale. As might be expected, the goodness-of-fit metric for the cent scale is by
far the best of the bunch: 2.38 cents accumulated error.

Comparing the Tempered Microtonal Scales. Figure 3.21 compares tempered microtonal
tunings to the natural chromatic scale, which is shown as a ruler in the background for comparison
with the other scales. The fairly crude resolution of this visual aid still reveals a lot about the
accuracy of the approximations these scales make to just ratios. It is evident, for instance, how
much better the 19-tone scale's thirds and sixths are than those of the equal-tempered scale. It
also shows how much better the 53-tone scale is than all the rest at approximating the just
intervals.

Table 3.8 summarizes the goodness-of-fit metric for the tempered scales considered above. As
expected, increasing the number of divisions of the octave makes it possible to approximate ever
more closely the just diatonic scale by judicious choice of tempered microtonal intervals.


Table 3.8

Goodness of Fit

Scale           Accumulated error (cents)
12-tone         103.624
19-tone         109.310
Quarter-tone    103.624
53-tone          10.402
Cent              2.37599
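The entries of table 3.8 can be reproduced by applying the method described in section 3.13.6 to any equal division of the octave: pick, for each natural chromatic interval, the nearest tempered degree, then sum the magnitudes of the misses. The sketch below assumes the same natural chromatic ratios as table 3.6 (with tritone 64/45 and minor seventh 16/9); the function name goodness_of_fit is chosen here for illustration.

```python
import math

# Natural chromatic ratios above the unison (assumed; see table 3.6).
NATURAL = [(16, 15), (9, 8), (6, 5), (5, 4), (4, 3), (64, 45),
           (3, 2), (8, 5), (5, 3), (16, 9), (15, 8)]

def goodness_of_fit(n_divisions):
    """Sum of |nearest tempered degree - just interval| in cents for an
    equal division of the octave into n_divisions steps."""
    step = 1200 / n_divisions
    total = 0.0
    for num, den in NATURAL:
        target = 1200 * math.log2(num / den)
        nearest = round(target / step) * step   # closest tempered degree
        total += abs(nearest - target)
    return total

for label, n in [("12-tone", 12), ("19-tone", 19), ("Quarter-tone", 24),
                 ("53-tone", 53), ("Cent", 1200)]:
    print(f"{label:13s} {goodness_of_fit(n):9.3f}")
```

The quarter-tone result equals the 12-tone result because none of the added quarter-tone degrees lies closer to a natural chromatic interval than the nearest semitone does.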


The fact that even the cent scale has an accumulated error, however small, is noteworthy.
Human hearing can't distinguish between adjacent cents (which is one reason it was developed).
So does it matter that the cent scale has a nonzero accumulated error, especially if it is much
smaller than human hearing can detect? Haven't we provided ourselves with a way to temper
a scale that is for all practical purposes indistinguishable from the just intervals? Remember that
Western musical culture has lived happily with the errors in the equal-tempered scale for
centuries. Nonetheless, there are those who criticize the whole approach to tempering intervals on
principle.

3.14.2 Just Microtonal Scales

No matter how close they come to the just intervals, tempered microtonal scales do not meet the
needs of what I call the intonation rationalists like Partch because they are nothing more than
approximations (albeit sometimes pretty good approximations) to the just intervals. From the
perspective of the intonation rationalists, the whole idea of tempered intervals is like the
difference between 3.14 and π. It's like chopping down a forest and replacing it with telephone
poles. They are not the same as the trees, no matter how close they might stand to where trees
once stood.

Just tuning systems using microtones are quite widespread, including fifteenth-century
European scales, tuning systems from cultures around the world, and systems constructed by
contemporary theorists and composers. In Europe microtonal just scales were originally developed to
improve transposability. In the classical music of Hindustan and the traditional music of Islamic
countries, microtonal just scales are used without transposition. The American theorist Harry
Partch also developed an elaborate just microtonal scale. This section explores a small sampling
of tuning systems using just microtonal intervals.

Historical European Microtonal Scales. According to Murray Barbour, just-intonation microtonal
scales manifested in Europe in the late fifteenth century with the introduction of keyboards
with split keys, for instance, for E♭ and D♯, to avoid the bad effects of transposing on just
keyboards. Barbour (1953) writes,

The theory was simple enough: provide at least four sets of notes, each set being in Pythagorean
tuning and forming just major thirds with the notes in another set; construct a keyboard upon which
these notes may be played with the minimum of inconvenience. Only in the design of the keyboards
did the inventors show their ingenuity, an ingenuity that might better have been devoted to
something more practical. (113)






Figure 3.22

Just keyboard by Joan Albert Ban.

Figure 3.22 shows the keyboard developed by Joan Albert Ban (1597-1644) in Haarlem in
1639, based on the theories of Fogliano and Mersenne. Through the addition of floating keys
and split keys, each natural key is provided with all possible justly intoned triads, major and
minor. The floating D provides the 9/8 above C, while the natural D below it provides the 10/9.
The D-major triad D:F♯:A starts on D natural and is spelled 3240 : 2592 : 2160, and the G-major
triad starts on G natural and is spelled 2400 : 1920 : 3200 (requiring use of the floating D).

But adding microtones to the keyboard proved to be a dead end. They were difficult and
temperamental to build and to play, and no common scheme emerged as a rallying point. Electronic
keyboards that became available in the twentieth century helped revive interest in microtonal
scales. Harry Partch built an entire orchestra of acoustic instruments using various microtonal
layouts. However, all have remained idiosyncrasies. With the introduction of the personal computer,
it finally became possible to experiment with these scales without having to construct elaborate
physical keyboards, and there has been a resurgence of research interest. If a new tolerance for
diversity develops, this music may yet get its proper hearing (Keislar 1988).

Partch's 43-Tone Scale. Harry Partch is arguably the father of modern microtonality. His
fundamental reexamination of the foundations of music theory and his consequent radical departure
from musical conventions are described in minute detail in his book Genesis of a Music (1947).
The direction of his thinking required that he create an entire orchestra of original instruments
and compose a body of musical works for it. He said of himself, "I am a composer seduced into
carpentry,"¹³ but he was also a brilliant theorist. Though he took issue with many accepted musical
dogmas of his day, he is principally remembered for his stance on intonation.

He felt that the approximations that the chromatic equal-tempered tuning system made to the
pure small integer ratios were a travesty to the ear. For instance, he wrote,

After hearing an absolutely true triad one feels that the tempered triad throws its weight around
in a strangely uneasy fashion, which is not at all remarkable, for what it wants to do more than
anything else is to go off and sit down somewhere - it actually requires resolution! Thus has the
composition of music for the tempered scale become one long harried and constipated epic, a
veritable and futile pilgrimage in search of that never-never spot - a place to sit! (179)

He recognized that his vitriol could become excessive. "In attempting to correct an illogical
situation a man tends to become an extremist" (97). But he was a man with a mission.

He considered the pure untempered ratios to be unique individualities, which the tempered
tunings could only approximate. He created orders of tonalities out of small integer ratios of
various numerical limits based on scholarship, reasoning, and his own ear. By basing his system on
integer ratios, he necessarily discarded closed, transposable, common tempered tuning for an open
system populated by a plethora of ratios that were as individualistic as himself.

Table 3.9 shows Partch's 43-tone scale. The table gives the degree number, the ratio, the cents
from unison of the ratio, the ratio of the interval to the previous degree (the size of the step),
and the interval size in cents. Because of the increased intervallic resources, Partch categorized
ranges of his intervals as having various emotional functions roughly analogous to those commonly
attributed to the just diatonic scale:

■ Intervals of power, the perfect intervals: unison (#1), octave (#44), fourth (#19), and fifth
(#26), shown with heavy outline

■ Intervals of suspense, the intervals in the region of the tritone from the fourth (#19) to the
fifth (#26), shown with light shading

■ Emotional intervals, the intervals in the regions of the thirds (#11 to #18) and sixths (#27 to
#34), shown with heavy shading

■ Intervals of approach, the intervals in the regions of the seconds (#2 to #10) and sevenths (#35
to #43), shown with light outline


Table 3.9

Partch's 43-Tone Scale

No.  Ratio    Cents    Step Size  Step Cents    No.  Ratio    Cents     Step Size  Step Cents
1    1/1        0                               23   10/7      617.49   50/49      34.98
2    81/80     21.50   81/80      21.51         24   16/11     648.68   56/55      31.19
3    33/32     53.27   55/54      31.77         25   40/27     680.45   55/54      31.77
4    21/20     84.47   56/55      31.19         26   3/2       701.96   81/80      21.51
5    16/15    111.73   64/63      27.26         27   32/21     729.20   64/63      27.26
6    12/11    150.64   45/44      38.91         28   14/9      764.90   49/48      35.70
7    11/10    165.00   121/120    14.37         29   11/7      782.49   99/98      17.58
8    10/9     182.40   100/99     17.40         30   8/5       813.69   56/55      31.19
9    9/8      203.91   81/80      21.51         31   18/11     852.59   45/44      38.91
10   8/7      231.17   64/63      27.26         32   5/3       884.36   55/54      31.77
11   7/6      266.87   49/48      35.70         33   27/16     905.87   81/80      21.51
12   32/27    294.14   64/63      27.26         34   12/7      933.13   64/63      27.26
13   6/5      315.64   81/80      21.51         35   7/4       968.83   49/48      35.70
14   11/9     347.40   55/54      31.77         36   16/9      996.09   64/63      27.26
15   5/4      386.31   45/44      38.91         37   9/5      1017.60   81/80      21.51
16   14/11    417.51   56/55      31.19         38   20/11    1035.00   100/99     17.40
17   9/7      435.08   99/98      17.58         39   11/6     1049.36   121/120    14.37
18   21/16    470.78   49/48      35.70         40   15/8     1088.27   45/44      38.91
19   4/3      498.05   64/63      27.26         41   40/21    1115.53   64/63      27.26
20   27/20    519.55   81/80      21.51         42   64/33    1146.73   56/55      31.19
21   11/8     551.32   55/54      31.77         43   160/81   1178.49   55/54      31.77
22   7/5      582.51   56/55      31.20         44   2/1      1200.00   81/80      21.51

It is interesting to observe the symmetric regularity of interval size between steps of this scale
(figure 3.23). The scale is not symmetrical at the fifth, but at three degrees below the fifth, at
number 23, the midpoint of the interval order (see table 3.9). Note the plethora of different step
sizes in figure 3.23.
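The symmetry of the step sizes can be verified with exact rational arithmetic. The sketch below takes the ratios of table 3.9 as given and uses Python's standard fractions module; the variable names are invented for this example.

```python
from fractions import Fraction

# Ratios of table 3.9, degrees 1 through 44 (unison through octave).
PARTCH = [Fraction(r) for r in (
    "1/1 81/80 33/32 21/20 16/15 12/11 11/10 10/9 9/8 8/7 7/6 32/27 6/5 "
    "11/9 5/4 14/11 9/7 21/16 4/3 27/20 11/8 7/5 10/7 16/11 40/27 3/2 "
    "32/21 14/9 11/7 8/5 18/11 5/3 27/16 12/7 7/4 16/9 9/5 20/11 11/6 "
    "15/8 40/21 64/33 160/81 2/1"
).split()]

# Step between each degree and the previous one (43 steps in all).
steps = [upper / lower for lower, upper in zip(PARTCH, PARTCH[1:])]

# The step sequence reads the same forward and backward ...
steps_are_symmetric = steps == steps[::-1]

# ... because each degree and its mirror image multiply to the octave.
mirrors_span_octave = all(a * b == 2 for a, b in zip(PARTCH, reversed(PARTCH)))

print("symmetric steps:", steps_are_symmetric)
print("middle step:", steps[len(steps) // 2])        # between degrees 22 and 23
print("distinct step sizes:", sorted(set(steps)))
```

The middle step, between degrees 22 (7/5) and 23 (10/7), is the axis of the symmetry, and the set of distinct step sizes gives a sense of the "plethora" mentioned above.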

Figure 3.24 compares Partch's scale and the equal-tempered chromatic scale, with the natural 
chromatic scale shown as a background ruler. 

Hindustani Scales. Whereas Western music has emphasized harmonic practices requiring
transposition and modulation, classical Hindustani music has emphasized melodic practices that are
based on just intervals and do not transpose. The degrees of the classical Hindustani scale are
called sruti. The most common scale has 22 sruti per octave. Continuous-pitch instruments such as
the voice or sarod can adapt intonation as needed to play any subset of this scale. Fretted
instruments such as the vina, sitar, and esraj are supplied with adjustable frets that can be
shifted to adapt to different subsets of sruti intervals. The principal playing strings of these
fretted instruments can be pulled sideways across the frets, stretching the string to achieve other
sruti as needed, and for ornamentation.



Figure 3.24

Partch's scale and equal-tempered chromatic scale compared.


Table 3.10

Hindustani 22-Sruti Scale

Degree  Ratio     Cents    Interval  Size     Degree  Ratio     Cents     Interval  Size
1       1/1         0                         12      45/32      590.22   25/24     70.67
2       256/243    90.23   256/243   90.23    13      729/512    611.73   81/80     21.51
3       16/15     111.73   81/80     21.51    14      3/2        701.96   256/243   90.23
4       10/9      182.40   25/24     70.67    15      128/81     792.18   256/243   90.23
5       9/8       203.91   81/80     21.51    16      8/5        813.69   81/80     21.51
6       32/27     294.14   256/243   90.23    17      5/3        884.36   25/24     70.67
7       6/5       315.64   81/80     21.51    18      27/16      905.87   81/80     21.51
8       5/4       386.31   25/24     70.67    19      16/9       996.09   256/243   90.23
9       81/64     407.82   81/80     21.51    20      9/5       1017.60   81/80     21.51
10      4/3       498.05   256/243   90.23    21      15/8      1088.27   25/24     70.67
11      27/20     519.55   81/80     21.51    22      243/128   1109.78   81/80     21.51



Barbour (1953, 113) assumes that the Hindustani sruti scale is based on an equal division of the
octave into 22 parts, much as one of the common Arabic scales is an equal division into 17 parts.
He writes, "If these are considered equal, a new system arises with 'practically perfect' major
thirds... and very sharp fifths" (116). Judging from their music, it seems very unlikely that
Hindustani musicians would settle for sharp fifths, however. Many sources give the sruti scale as
an extended just system. This is a more satisfying explanation because it would give a high degree
of consonance between the scale and the rich harmonic content of many Hindustani instruments.

Figure 3.25

Natural chromatic and 22-sruti scales compared.


Figure 3.26

Interval structure of the 22-sruti scale.


Table 3.10 shows the intervals commonly given for the 22-sruti scale. Figure 3.25 compares the
22-sruti scale with the natural chromatic scale and the Pythagorean dodecaphonic scale. The
sruti that are in neither of these other scales are shaded in the table and figure. According to
table 3.10 and figure 3.25, the 22-sruti scale contains both the natural chromatic and Pythagorean
chromatic scales as subsets, and contains four additional intervals that are not in either of the
others. One can select from this any combination of just or Pythagorean scales, plus a variety of
other scales. A prime factor analysis of the 22 sruti ratios shows that this is a five-limit scale.
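The prime factor analysis mentioned above is easy to sketch: factor every numerator and denominator in table 3.10 and take the largest prime that appears. The helper names below are hypothetical, chosen only for this illustration.

```python
from fractions import Fraction

# The 22 sruti ratios as listed in table 3.10.
SRUTI = [Fraction(r) for r in (
    "1/1 256/243 16/15 10/9 9/8 32/27 6/5 5/4 81/64 4/3 27/20 45/32 "
    "729/512 3/2 128/81 8/5 5/3 27/16 16/9 9/5 15/8 243/128"
).split()]

def largest_prime_factor(n):
    """Largest prime dividing n, found by trial division (1 if n == 1)."""
    largest, p = 1, 2
    while p * p <= n:
        while n % p == 0:
            largest, n = p, n // p
        p += 1
    return max(largest, n) if n > 1 else largest

def prime_limit(ratios):
    """Largest prime appearing in any numerator or denominator."""
    return max(
        max(largest_prime_factor(r.numerator), largest_prime_factor(r.denominator))
        for r in ratios
    )

print("prime limit of the 22-sruti scale:", prime_limit(SRUTI))
```

Every ratio in the table factors into powers of 2, 3, and 5 only, which is what "five-limit" means here.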

Figure 3.26 shows the interval structure of the 22-sruti scale. There are three interval sizes:
256/243, 25/24, and 81/80. Pingle (1962, 31) calls the smallest intervals murchanas. Interestingly,
the size of the murchana interval, 81/80, corresponds to the syntonic comma.

Why are there 22 srutis? I was told by my Hindu music teachers that the 22-sruti scale is basically
chromatic. It contains both the natural and the Pythagorean chromatic scales.¹⁴ The 22 degrees come
from taking all chromatic intervals except the unison and fifth, which are fixed, and splitting them
into a lower and an upper microtonal interval. And, indeed, $2 \cdot (12 - 2) + 2 = 22$ degrees
altogether. While this is a good description of what we see in figure 3.25, it is not an
explanation. Another conjecture I've heard is that 22 was chosen because the ratio of the 22 sruti
to the diatonic scale degrees that anchor it is

$\frac{22}{7} = 3.14286\ldots \approx 3.14159\ldots = \pi.$

Although the ratio of 22/7 was indeed used in ancient times as a rational approximation to π, this
is not a particularly compelling musical explanation (Beckman 1976).




Figure 3.27

22-sruti scale as circle of fifths and circle of fourths.


The most satisfying explanation I've heard so far comes from Lentz (1961), who characterized
the scale as a combination of the cycle of fourths and the cycle of fifths. The process, which is
much like that described for the Pythagorean dodecaphonic scale, goes like this:

1. Create a set of intervals $(3/2)^m$ for $0 \le m < 12$.

2. Create another set of intervals $(4/3)^n$ for $0 \le n < 12$.

3. Subtract as many octaves as necessary to position each interval within the compass of one octave.

This creates a set of 23 unique intervals (not 24 because the unison is repeated in both series).
Figure 3.27 shows the sruti scale of table 3.10 compared to the circle of fifths and circle of
fourths. The interval 262,144/177,147 in the circle of fourths (just below the 3/2) must be
discarded, leaving 22 sruti.

At first glance, Lentz's combination of fifths and fourths looks very close to the 22-sruti scale.
However, small discrepancies are evident even in this crude graphic: some of the powers of fifths
and fourths do not line up with the intervals given in the literature but are a little sharp or
flat. In fact, the intervals that miss their mark are off by exactly 32,805/32,768, an interval
historically called a schisma. For instance, while the third degree in table 3.10 is given as 16/15,
the third degree by the circle of fifths is 2187/2048, which is a difference of 32,805/32,768.
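Lentz's construction, and the schisma discrepancies just described, can be reproduced with exact fractions. The following is a sketch under the assumption that each series runs over the exponents 0 through 11; the names used are illustrative and not taken from Lentz.

```python
from fractions import Fraction

SCHISMA = Fraction(32805, 32768)

def reduce_to_octave(ratio):
    """Shift a ratio by octaves until it lies in the range [1, 2)."""
    while ratio >= 2:
        ratio /= 2
    while ratio < 1:
        ratio *= 2
    return ratio

# Steps 1-3: twelve fifths and twelve fourths, folded into one octave.
fifths = {reduce_to_octave(Fraction(3, 2) ** m) for m in range(12)}
fourths = {reduce_to_octave(Fraction(4, 3) ** n) for n in range(12)}
lentz = fifths | fourths
combined_count = len(lentz)                 # the unison is shared by both series
lentz.discard(Fraction(262144, 177147))     # the interval just below 3/2

# Ratios commonly given for the 22 sruti (table 3.10).
SRUTI = [Fraction(r) for r in (
    "1/1 256/243 16/15 10/9 9/8 32/27 6/5 5/4 81/64 4/3 27/20 45/32 "
    "729/512 3/2 128/81 8/5 5/3 27/16 16/9 9/5 15/8 243/128"
).split()]

# Compare degree by degree, in ascending order.
exact, off_by_schisma = [], []
for built, listed in zip(sorted(lentz), sorted(SRUTI)):
    if built == listed:
        exact.append(listed)
    elif built / listed in (SCHISMA, 1 / SCHISMA):
        off_by_schisma.append(listed)

print("exact matches:", len(exact))
print("off by a schisma:", [str(r) for r in off_by_schisma])
```

With these inputs, the degrees that do not match exactly all differ from their constructed counterparts by one schisma, up or down, and they are precisely the ratios in table 3.10 that involve the prime 5.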

Lentz's method has the advantage of being a simple and elegant construction, but like the
Pythagorean scale, the result may please theorists more than musicians. Who is to say whether an
oriental equivalent of Pareja didn't argue for a version of the 22-sruti scale made simpler by
adjusting the sruti up and down by schismas to nearby smaller integer ratios, leaving the
conventional ratios given in table 3.10?

The 22-sruti scale described here does not by any means exhaust the Hindustani interest in the
number 22. An interesting just diatonic scale given by Pingle (1962) consists of the following
seven intervals:

22/22, 26/22, 29/22, 31/22, 35/22, 39/22, 42/22, 44/22.

Figure 3.28 compares Pingle's scale with the 22-sruti scale and the natural just diatonic scale. It
is an 11-limit scale with a most exotic sound, as all of its intervals are quite sharp in comparison
to the just diatonic scale.



Figure 3.28

B. A. Pingle's diatonic scale.


Figure 3.29

Rule of 18 for placing frets.


3.15 Rule of 18

The rule of 18 has been used by Western stringed instrument builders to construct the scales
their instruments play since it was first proposed as a tempered scale by Vincenzo Galilei (see
section 3.13.1). It highlights a number of interesting mathematical principles.

It so happens that the size of a tempered semitone, the irrational number $\sqrt[12]{2}$, is fairly
closely approximated by the rational ratio 18/17, that is,

$\frac{18}{17} = 1.0588.$

It is much easier in practice for builders to work with ratios of integers than irrational ratios
when dividing up a linear distance. As shown in figure 3.29, each string of a fretted instrument
is suspended between two points, the bridge and the nut. The frequency of the open string is
determined by a peg or screw arrangement near the nut, which tightens or loosens it, varying
the tension of the string. The performer varies the frequency by stopping off different lengths
of the string against the fingerboard, thereby changing the mass of the part of the string that
can vibrate.

3.15.1 Fret Calculations

Fret wires placed along the fingerboard perpendicular to the string help the performer stop off
exactly the right length to sound intervals in the scale that the instrument is built to play.
Unfretted stringed instruments such as the violin are played similarly but do not have frets to
guide the performer's fingers. Frets are foremost an aid to intonation, but they also make it
possible to correctly stop multiple strings simultaneously, a useful feature for polyphony.
Historically, the frets are placed using the 18/17 tempered scale of Galilei.

To place the frets, the rule of 18 states

Each subsequent fret should be located 1/18 of the remaining distance to the bridge of the
instrument.

Let's take for an example a string of length $x_0 = 1$ meter from bridge to nut (figure 3.29). Then
the rule of 18 says that the distance $x_1$ from the bridge to the first fret should be

$x_1 = x_0 - \frac{x_0}{18} = 1 - \frac{1}{18}$ m.    First Fret (3.16)

In order to sound a semitone higher, the rule of 18 says that the length of the string from the
bridge to the first fret must be 17/18 of the length of the entire string $x_0$.

The distance from the bridge to the second fret, $x_2$, is calculated from the "remaining distance,"
which is $x_1$. So we subtract 1/18 of $x_1$ from the length of $x_1$:

$x_2 = x_1 - \frac{x_1}{18} = \left(\frac{17}{18}\right)^2$ m.    Second Fret (3.17)

3.15.2 The Flaw in the Rule of 18

If we continue to apply the rule of 18 twelve times, then the twelfth fret will end up being placed
near the midpoint of the string. However, when the string is stopped at the twelfth fret, although
ideally it should sound exactly an octave higher than the whole string, it will actually sound
slightly flat because $18/17 < \sqrt[12]{2}$. Each fret placed by the rule of 18 will sound slightly
flat, and the error will compound for higher-numbered frets because the position of each subsequent
fret is derived from the previous one. For example, if the length of the open string is $x_0 = 1$ m,
then the position of the twelfth fret is approximately $x_{12} = 0.504$ m instead of the desired
0.5 m, which is where it should be to sound exactly an octave above the open string.

Happily, another artifact of stringed instruments comes to the rescue to a certain extent. Fretting
a string bends it, decreasing its elasticity slightly, which raises its pitch slightly. By the
nature of their construction, strings must be bent progressively more the higher the fret, which
counteracts the progressive flattening of the rule of 18. The precise amount by which the string's
pitch is raised by this stretching depends upon the geometry of the instrument and the dimensions
and tension of the string. In practice, many additional factors must be taken into account by a
stringed instrument maker, a process called (appropriately enough) compensation.





Alternatively, if we shave off a little from the rule of 18 and instead use the "rule of 17.81715,"
we get fret distances that nearly match the equal-tempered scale, and $x_{12} = 0.500$.
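The fret placement rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `fret_positions` and its parameters are hypothetical, the string is idealized, and compensation is ignored.

```python
def fret_positions(open_length, k, n_frets=12):
    """Distances from the bridge to frets 1..n_frets under the 'rule of k'.

    open_length is the bridge-to-nut length; k is 18 for the historical
    rule or 17.81715 for the modified rule discussed in the text.
    """
    positions = []
    x = open_length
    for _ in range(n_frets):
        x = x - x / k  # each fret removes 1/k of the remaining length
        positions.append(x)
    return positions


rule_18 = fret_positions(1.0, 18.0)
rule_et = fret_positions(1.0, 17.81715)

print(f"rule of 18:       x12 = {rule_18[-1]:.4f} m")
print(f"rule of 17.81715: x12 = {rule_et[-1]:.4f} m")
```

With a 1 m string, the twelfth fret lands a few millimeters past the midpoint under the rule of 18 and essentially on the midpoint under the modified rule.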

3.15.3 Recursion 


The rule of 18 is an example of recursion, in which the next value in a sequence depends upon the
previous value (or values) in a well-defined way. Suppose we let $f_0$ be the frequency sounded when
the open string in figure 3.29 is played. Then the frequency of the string stopped at the first fret
would be $f_1 = f_0 \cdot 18/17$, and the frequency at the second fret would be $f_2 = f_1 \cdot 18/17$.
Generalizing, we can find the frequency of any fret:

$$f_n = f_{n-1} \cdot \frac{18}{17}. \tag{3.18}$$

This means that $f_3$ depends upon the value of $f_2$, which depends on the value of $f_1$, which depends
upon the value of $f_0$. In other words,

$$f_3 = f_0 \cdot \frac{18}{17} \cdot \frac{18}{17} \cdot \frac{18}{17}.$$

This means we can compute $f_3$ in terms of $f_0$ just by multiplying $f_0$ by $(18/17)^3$. Now that we see
the pattern, we can compute the frequency at the $n$th fret in terms only of $f_0$:

$$f_n = f_0 \cdot \left(\frac{18}{17}\right)^n. \tag{3.19}$$

In (3.19) the frequency of the $n$th fret depends only upon the frequency of the open string instead
of on the frequency of the fret that came before it, so this equation implements a direct calculation,
not a recursive one. If we set $f_0 = 440$ Hz, then by either (3.18) or (3.19) the value of $f_3$ comes out
to be 522.3 Hz.

Where a direct equivalent to a recursive formula can be found, it is generally to be preferred.

■ It avoids the problem of compounding errors in calculation.

■ It is generally faster because we do not need to calculate all the values between the starting value
and the value of interest.

This can be important if, for example, we must calculate values of a function that are far from where
we started.
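To make the contrast concrete, here is a minimal Python sketch of the recursive form (3.18) and the direct form (3.19). The function names are hypothetical, chosen only for this illustration.

```python
RATIO = 18 / 17  # frequency ratio between adjacent frets under the rule of 18


def fret_frequency_recursive(f0, n):
    """Equation (3.18): step through every fret from the open string up to n."""
    f = f0
    for _ in range(n):
        f = f * RATIO
    return f


def fret_frequency_direct(f0, n):
    """Equation (3.19): jump straight to fret n from the open-string frequency."""
    return f0 * RATIO ** n


f3_recursive = fret_frequency_recursive(440.0, 3)
f3_direct = fret_frequency_direct(440.0, 3)
print(round(f3_recursive, 1), round(f3_direct, 1))
```

Both calls agree on the third fret to within floating-point rounding. The recursive version needs n multiplications in sequence, while the direct version needs only one exponentiation.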





One can often find a way to convert between recursive and direct representations of a formula.
For instance, we can write the rule for generating the equal-tempered scale recursively as
follows:

$$f_n = f_{n-1} \cdot 2^{1/12},$$

and by similar reasoning, its direct form is

$$f_n = f_0 \cdot 2^{n/12},$$

which is equivalent to equation (3.1).

The rule of 18 also describes an iterative process. If $x_n$ represents the distance of the $n$th fret from
the bridge, then the rule of 18 can be expressed as

$$x_n = x_{n-1} - \frac{x_{n-1}}{k}, \tag{3.20}$$

where $k$ is a constant factor, either 18 or 17.81715, as discussed. Equation (3.20) says, "The
distance from the bridge to the next fret ($x_n$) equals the distance from the bridge to the previous fret
($x_{n-1}$) minus that distance divided by $k$." Using (3.20) to compute the distance from the bridge to
the third fret, $x_3$, we proceed as follows:

$$x_3 = x_2 - \frac{x_2}{k}
     = \left(x_1 - \frac{x_1}{k}\right) - \frac{x_1 - \dfrac{x_1}{k}}{k}
     = \left(x_0 - \frac{x_0}{k} - \frac{x_0 - \dfrac{x_0}{k}}{k}\right) - \frac{x_0 - \dfrac{x_0}{k} - \dfrac{x_0 - \dfrac{x_0}{k}}{k}}{k}. \tag{3.21}$$

Assuming the length of the open string from bridge to nut is $x_0 = 1$ m, and using the modified rule of
18 ($k = 17.81715$), then $x_3 = 0.84$. Notice the interesting way the terms stack up in (3.21). These
are called continued fractions.

3.16 Deconstructing Tonal Harmony

Back when the Pythagorean scale ruled the day, the degrees each had a unique character and
function, like chess pieces. The asymmetry of the scale oriented the ear as the music unfolded. The tonic
degree was king, and a hierarchy of tones surrounded it like courtiers. The system was called tonal
harmony.





Even after the advent of the chromatic equal-tempered scale, composers persisted (as they still
do today) in exploring functional harmony based on the expectations of listeners trained to hear
the characteristic intervals of the diatonic scale. But the adjustments made over the centuries to
facilitate transposition had the eventual effect of disconnecting the pitches from their harmonic
function.

By the end of the late Romantic era, functional harmonization had reached its expressive limits
because, as its vocabulary expanded, the listener's roots in the old diatonic scheme gradually
weakened, until all that was left were the 12 pitches, all of which were now equivalent both in function
and in tonal palette.

A century after the equal-tempered tuning system was widely adopted in the West, the composer
Arnold Schoenberg and his associates (the so-called Second Viennese School) were inspired to
extend the idea of pitch equality further. They believed the old functional harmonic practices
lingered on only as a historical artifact of the old just scales and should now be discarded. They
devised atonal compositional strategies to remove key-centeredness from their music and so to
thwart the ear's trained habit of organizing music harmonically. They eventually developed the
12-tone compositional methodology by giving all pitches equal prominence (see section 9.10).
Interestingly, this compositional motivation bears certain resemblances to political experiments in
radical democracy, communism, and socialism that occurred in Europe around the same time.
Alignments between political economy and musical aesthetics have existed throughout the ages,
and transitions in one often presage a transition in the other (Atali 1985). Plato noticed this effect
long ago. He said pessimistically, "A change to a new type of music is something to beware of as
a hazard of all our fortunes. For the modes of music are never disturbed without unsettling the most
fundamental political and social conventions" (Republic 424c).

Here, once again, we arrive at the nexus between society, aesthetics, and technology. It seems
that the deconstruction of tonal harmony at the end of the Romantic era was the inevitable result
of the availability of effective transposable key schemes. This means that advances in musical scale
engineering had profound reflexive consequences on musical aesthetics. Circularly, the desire for
transposable key schemes was originally motivated by aesthetic requirements, but the consequence
of their development was a fundamental transformation in aesthetics.

Thus music takes its place in the pantheon of human pursuits: no activity is immune from our
reflexive and self-redefining capacities, which is perhaps our most unique characteristic as a species.

3.17 Deconstructing the Octave 

Every true revolution encompasses the paradigm it overthrows, even as it supersedes it. The
revolution of the Second Viennese School led to the deconstruction of tonal harmony, but the octave
remained sacred. The revolution of the microtonalists led to the deconstruction of the chromatic
scale, but the octave likewise remained sacred. The octave has been an invariant feature of virtually
all historical scales because of octave equivalence, which is our tendency to hear pitches played
at octaves as functionally identical. The equivalence is felt so strongly that musical scales around
the world are almost invariably organized around the 2:1 ratio of the octave, and pitches related
by octaves are virtually always given the same name. Octave equivalence is deeply rooted in our
perceptual system (see section 6.4.6).

The invariance of the octave is hard-wired in equation (2.2), $f_x = f_R \cdot 2^x$, $x \in \mathbb{R}$, because of
the constant 2 in that equation. If we generalize it,

$$f_x = f_R \cdot k^x, \quad k \in I,\ x \in \mathbb{R}, \tag{3.22}$$

we can construct scales that are not bound by the octave. (It is customary but not strictly necessary
to limit $k$ in (3.22) to be an integer.) The value of $k$ defines what I call the compass interval. Let
the compass interval be $k^{x+1} : k^x$ for any real $x$. For example, when $k = 2$, the compass interval is
2:1, the octave. When $k = 3$, the compass interval is 3:1, an octave plus a fifth, otherwise known
as a twelfth. The value $x$ is typically a rational fraction indicating a division of the compass interval.
For the equal-tempered scale, $x = k/12$, where $k$ indexes a particular division of the compass interval.
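As a rough illustration of equation (3.22), the following Python sketch generates equal divisions of an arbitrary compass interval. The names `scale_frequency` and `equal_division_scale`, the 440 Hz reference, and the choice of 13 divisions for the twelfth are assumptions made for this example only.

```python
def scale_frequency(f_ref, compass, x):
    """Equation (3.22): f_x = f_R * k**x, where `compass` plays the role of k."""
    return f_ref * compass ** x


def equal_division_scale(f_ref, compass, divisions):
    """Frequencies of an equal division of the compass interval, endpoints included."""
    return [scale_frequency(f_ref, compass, i / divisions)
            for i in range(divisions + 1)]


octave_scale = equal_division_scale(440.0, 2, 12)   # compass interval 2:1
twelfth_scale = equal_division_scale(440.0, 3, 13)  # compass interval 3:1

print(round(octave_scale[-1], 3), round(twelfth_scale[-1], 3))
```

The last entry of each list sits exactly one compass interval above the reference: an octave (2:1) in the first case and a twelfth (3:1) in the second.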

The inclusion of non-octave-based scales vastly widens the scale possibilities we must consider.
However, there are two important characteristics of octave-based scales that we would do well to
preserve when evaluating the suitability of non-octave-based scales for musical purposes.
Candidate scales should have

■ A high degree of consonance for as many of the intervals as possible

■ A high degree of internal order, that is, a regular pattern of steps and step sizes

3.17.1 The Bohlen-Pierce Scale

A non-octave-based scale that arguably meets the above criteria and has a number of other
interesting features as well was developed by several music researchers in the latter part of the twentieth
century. Heinz Bohlen (1978), an electronics and communications engineer without formal musical
training (which fact was probably an asset to his accomplishment) was the first to consider building
a scale from a triad not based on the familiar 4:5:6 ratios of the natural major scale, but upon the
ratios 3:5:7 and the compass of an octave and a fifth. As the compass interval of 2:1 is called the
octave, the compass interval of the twelfth was dubbed the tritave by John Pierce, who independently
discovered this scale system (Mathews, Roberts, and Pierce 1984; Mathews and Pierce 1980). 15

Because the scale is made from simple integer ratios that are harmonic by definition, it meets
the first criterion. But because it does not include an octave and duplicates but two of the
octave-based just intervals, it is completely incompatible with any octave-based scale. As for the
second criterion, it does have a high degree of internal order.

3.17.2 Constructing the Bohlen-Pierce Just Scale

We can construct this scale using the standard method of adding and subtracting intervals,
beginning with the 3:5:7 triad.





1. Take as the first degree of the scale the unison 3/3. Positioning the root of the 3:5:7 triad on the
first degree yields scale intervals 3/3 : 5/3 : 7/3. The tritave corresponds to 9/3 = 3/1, giving the
degrees shown in figure 3.30.

2. Starting a new root on the 5/3, we can spell another triad with the ratios 5/3 : 7/3 : 9/3. This 5:7:9
triad is shown in figure 3.31.

In the next two steps, we extend the scale to seven degrees. 

3. Transpose the 3:5:7 triad in figure 3.30 so that its top pitch equals the 9/3 (figure 3.32). The
top of the figure shows the 3:5:7 triad rooted on the first degree, and beneath it is the transposed
3:5:7 triad with its top pitch aligned with the tritave. To find the new root of the transposed triad,
we subtract the interval 7/3 from the interval of the tritave:

$$\frac{3}{1} \div \frac{7}{3} = \frac{9}{7}.$$

We find the middle pitch by adding the interval 5/3 to the root:

$$\frac{9}{7} \cdot \frac{5}{3} = \frac{45}{21} = \frac{15}{7}.$$

The root and middle pitches of the transposed 3:5:7 triad thus add two new scale degrees at
9/7 and 15/7.

Figure 3.30
Bohlen-Pierce just scale, 3:5:7 triad and tritave.

Figure 3.31
Bohlen-Pierce just scale, 5:7:9 triad.

Figure 3.32
Bohlen-Pierce just scale, 9/7 and 15/7 intervals.

4. Take the 5:7:9 triad from figure 3.31 and position its root on the first degree of the scale. To do
so, subtract the interval 5/3 from each interval:

$$\left(\frac{5}{3} : \frac{7}{3} : \frac{9}{3}\right) \div \frac{5}{3} = \frac{5}{5} : \frac{7}{5} : \frac{9}{5}.$$

We derive two new intervals this way, 7/5 and 9/5.

Figure 3.33 shows the resulting scale. The largest prime is 7, so this is a seven-limit scale. 
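The four construction steps above can be retraced with exact rational arithmetic. This is a sketch only: the variable names are hypothetical, and Python's standard `fractions` module stands in for the hand calculation.

```python
from fractions import Fraction as F

tritave = F(3, 1)

# Steps 1 and 2: the 3:5:7 triad on the first degree and the 5:7:9 triad on 5/3.
triad_357 = [F(3, 3), F(5, 3), F(7, 3)]
triad_579 = [F(5, 3), F(7, 3), F(9, 3)]

# Step 3: transpose the 3:5:7 triad so its top pitch lands on the tritave.
shift = tritave / triad_357[-1]                      # subtract 7/3 from 3/1
transposed_357 = [r * shift for r in triad_357]      # root, middle, top

# Step 4: move the 5:7:9 triad so its root sits on the first degree.
rooted_579 = [r / triad_579[0] for r in triad_579]   # subtract 5/3 from each

# Collect the distinct degrees (the tritave is included as the closing pitch).
degrees = sorted(set(triad_357 + triad_579 + transposed_357 + rooted_579))
steps = [upper / lower for lower, upper in zip(degrees, degrees[1:])]

print([str(d) for d in degrees])
print([str(s) for s in steps])
```

The resulting step sizes read the same forward and backward, which is one way to see the internal order claimed for this scale.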

3.17.3 Constructing the Bohlen-Pierce Chromatic Scale

Figure 3.34 shows the interval sizes of the Bohlen-Pierce just diatonic scale. Observe the
symmetrical arrangement around the fourth degree. It is useful to classify the sizes of intervals as small
(1.08 and 1.09), medium (1.19), and large (1.29), shown in the figure from light to dark grey,
respectively. Remembering that we are comparing ratios, it seems that the medium interval is about
twice as large as the small intervals because $1.09^2 \approx 1.19$. Also, the sum of a small and a medium
interval is about equal to the large one because $1.08 \cdot 1.19 \approx 1.29$. So these interval sizes are
roughly in the order 1:2:3. We can better visualize their relative sizes if we lay the intervals over
on their sides (figure 3.35).

Figure 3.33
Bohlen-Pierce just diatonic scale.

Figure 3.34
Bohlen-Pierce step sizes.

Figure 3.35
Bohlen-Pierce step sizes on their sides.

We could devise a chromatic scale from these ratios as follows. First, we replace the large
intervals with the combination of a small and a medium interval. This leaves a scale containing only
small and medium steps, analogous to the half and whole steps of the equal-tempered scale. Then
we replace each medium interval with two small intervals, resulting in a scale containing only
small steps, analogous to the equal-tempered semitone scale.

1. Since (27/25)(25/21) = 9/7, we can exactly replace the two large (9/7) steps with the combination
of a small (27/25) and a medium (25/21) step.

2. The existing small steps (49/45 and 27/25) needn't change. They constitute semitones in the
scale.

3. Unfortunately, neither size small step exactly divides the medium step into two equal
parts. For instance, subtracting a 49/45 semitone from a 25/21 whole step leaves 375/343.
Similarly, subtracting a 27/25 semitone from a 25/21 whole step leaves 625/567. Altogether
then, we end up with four semitones from smallest to largest: 27/25, 49/45, 375/343, and
625/567.

4. We must choose the order in which we substitute the smaller intervals into larger ones. Shall
we break them up as {small, large} or {large, small}? Recalling the symmetry in the just
Bohlen-Pierce scale in figure 3.34, we can divide the intervals in a correspondingly symmetrical
way.
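The interval arithmetic in steps 1 through 3 can be checked exactly. The short Python sketch below uses hypothetical variable names and only verifies the ratios quoted in the text; it does not attempt the ordering decision of step 4.

```python
from fractions import Fraction as F

small_a, small_b = F(27, 25), F(49, 45)   # the two existing small steps
medium, large = F(25, 21), F(9, 7)        # the medium and large steps

# Step 1: a small (27/25) plus a medium (25/21) step equals the large step.
large_rebuilt = small_a * medium

# Step 3: what remains of a medium step after removing each kind of small step.
remainder_after_b = medium / small_b      # subtract a 49/45 semitone
remainder_after_a = medium / small_a      # subtract a 27/25 semitone

# The four semitone sizes, smallest to largest.
semitones = sorted([small_a, small_b, remainder_after_a, remainder_after_b])
print([str(s) for s in semitones])
```

Using exact fractions rather than floating point makes it plain that the four semitones are genuinely different sizes rather than rounding artifacts.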

Applying these principles to the Bohlen-Pierce diatonic scale results in the Bohlen-Pierce
chromatic just scale with ratios and step sizes as shown in figure 3.36. The result is a nicely
symmetrical scale of 13 degrees spanning the tritave. This is the scale originally worked out by
Bohlen (1978). Figure 3.37 shows the Bohlen-Pierce chromatic scale and the natural chromatic
scale for comparison. The only points of contact between the two scale systems are the 1/1 and
the 5/3 (major sixth).






Figure 3.37
Bohlen-Pierce and natural chromatic scales compared. (a) Bohlen-Pierce chromatic scale. (b) Natural chromatic scale.


3.17.4 The Bohlen-Pierce Equal-Tempered Scale 

There is an equal-tempered version of the Bohlen-Pierce chromatic scale, just as there is an
equal-tempered version of the natural chromatic scale. All we must do to create it is to set $k = 3$
and $x = k/13$ in equation (3.22), yielding

$$f_k = f_R \cdot 3^{k/13}. \qquad \text{Bohlen-Pierce Equal-Tempered Scale} \tag{3.23}$$

As shown in figure 3.38, the degrees of the equal-tempered Bohlen-Pierce scale are much closer to their
just counterparts than the octave-based equal-tempered scale degrees are to their just counterparts. The
equal-tempered Bohlen-Pierce scale has a goodness-of-fit metric of 81.56 cents to its chromatic just
counterpart, compared to the 103.624 cents goodness-of-fit metric for the equal-tempered scale.
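A small Python sketch can show how closely equation (3.23) tracks the just ratios. It compares only the just diatonic degrees derived in section 3.17.2 with their nearest equal-tempered steps; it does not reproduce the goodness-of-fit metric quoted above, whose definition is not restated here. All function and variable names are hypothetical.

```python
import math


def cents(ratio):
    """Size of a frequency ratio in cents (1200 cents per 2:1 octave)."""
    return 1200 * math.log2(ratio)


def bp_equal_tempered(f_ref, k):
    """Equation (3.23): frequency of step k of the equal-tempered Bohlen-Pierce scale."""
    return f_ref * 3 ** (k / 13)


step_cents = cents(3) / 13  # size of one equal-tempered Bohlen-Pierce step

# Just diatonic degrees from section 3.17.2, ending on the tritave.
just_degrees = [9 / 7, 7 / 5, 5 / 3, 9 / 5, 15 / 7, 7 / 3, 3 / 1]

nearest_steps = []
deviations = []
for ratio in just_degrees:
    k = round(cents(ratio) / step_cents)       # closest equal-tempered step
    nearest_steps.append(k)
    deviations.append(cents(ratio) - k * step_cents)

for ratio, k, dev in zip(just_degrees, nearest_steps, deviations):
    print(f"{ratio:.4f} -> step {k:2d}, deviation {dev:+.2f} cents")
```

Under these assumptions every just diatonic degree falls within a few cents of some equal-tempered step, consistent with the close agreement described in the text.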

3.17.5 Evaluating the Bohlen-Pierce Scale

Given the odd-numbered basis of the Bohlen-Pierce scale, Pierce suggested performing the scale
using only timbres with odd harmonics, such as a clarinet, to help emphasize the consonance of
the primary chords of the scale, and to help the usual expectation of octave equivalence give over
to the experience of tritave equivalence. Bohlen created an electronic organ with a clarinetlike
square wave timbre to experiment with the scale.

Figure 3.38
Bohlen-Pierce chromatic and equal-tempered scales compared. (a) Bohlen-Pierce equal-tempered scale. (b) Bohlen-Pierce chromatic scale. These are much closer to each other than their chromatic counterparts are.

Whether through some combination of neural wiring, or a lifetime of conditioning, or both, it
is hard for most listeners to hear past octave equivalence when listening to non-octave-based
scales. How, then, can we objectively compare the consonance of the Bohlen-Pierce scale with
other scales? Roberts and Mathews (1984) proposed intonation sensitivity as a way of evaluating
the perceptibility of consonance of a chord. They defined intonation sensitivity as the way in which
preference for a chord varies with the tuning or mistuning of the center note of the triad. Their study
determined that the 4:5:6 triad had a high degree of intonation sensitivity (as would be expected)
and that the intonation sensitivities of the 3:5:7 and 5:7:9 triads were very close to the 4:5:6. Indeed,
they are more like diatonic major triads in the way that preference varies with tuning than diatonic
minor triads are.

Mathews and Pierce (1989) investigated the consonance of the various triads available in
the Bohlen-Pierce scale. Musicians and nonmusicians judged the consonance/dissonance of the
78 triads that can be formed in the span of a tritave. Tones used odd harmonics only. They found
that listeners scored the triads over a wide range, indicating that consonance is a salient property
of the scale.

Mathews and Pierce also asked trained musicians and nonmusicians to judge the similarity of
Bohlen-Pierce chords and octave-based just chords. Here, the respondents diverged in their
rankings: whereas musicians and nonmusicians alike judged similarity primarily on pitch height,
musicians also ranked inversions of octave-based diatonic chords as similar whereas the nonmusicians
did not. Mathews and Pierce concluded from this, "It seems reasonable that training with the
[Bohlen-Pierce] scale would make it possible for listeners to recognize and respond to its structure,
just as trained musicians recognize and respond to the structure of the diatonic scale and diatonic
chords." Richard Boulanger's work Solemn Song for Evening is a fine example of the use of this
scale system.





3.18 The Prospects for Alternative Tunings

Certainly one liability of non-octave-based scales and of scales with other than 12 degrees per
octave is finding instruments to play them. The quarter-tone scale, for instance, requires two pianos
tuned a quarter tone apart. Interesting and elaborate keyboard constructions have been proposed
or built by various theoreticians over the centuries for different scale systems, both tempered and
rational (Keislar 1988). Perhaps Partch had the most imaginative and ambitious approach with his
orchestra of various instruments of his own design.

But the problems are not just theoretical; they are also economic. In order to construct a living
music one must have instruments, trained players, a body of musical work, and last but not least,
an interested and financially involved public. Although Partch did what he could within the span
of his lifetime to put his music on a sustainable basis, his instruments are now in danger of
becoming museum pieces, rarely played in public.

The advent of electronic and computer musical instruments certainly offers a new
opportunity for microtonality and non-octave-based scales (Wilkinson 1988). Music synthesizer
manufacturers sometimes include a means for microtonal experimentation in their hardware.
Numerous computer music programs are available that allow precise frequencies to be
generated. However, this addresses only the instrument need and does not guarantee players, works,
or audience.

3.19 Summary 

Intervals made from the ratios of small whole numbers are called the just intervals. Some believe
that the just intervals arose first from the harmonic series of musical instruments; others, that they
arose from the study of proportion by the Pythagoreans.

Intervals are added by multiplying their ratios and subtracted by dividing their ratios. 

The cent scale divides the octave into 1200 equal parts; each cent is one hundredth of a tempered
semitone.
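The two rules just summarized, interval arithmetic and the cent scale, fit in a few lines of Python. The function names below are hypothetical and serve only to illustrate the definitions.

```python
import math
from fractions import Fraction


def add_intervals(a, b):
    """Intervals are added by multiplying their ratios."""
    return a * b


def subtract_intervals(a, b):
    """Intervals are subtracted by dividing their ratios."""
    return a / b


def cents(ratio):
    """Convert a frequency ratio to cents: 1200 equal parts per octave."""
    return 1200 * math.log2(float(ratio))


fifth, fourth = Fraction(3, 2), Fraction(4, 3)
octave = add_intervals(fifth, fourth)            # a fifth plus a fourth
whole_tone = subtract_intervals(fifth, fourth)   # a fifth minus a fourth

print(octave, whole_tone)
print(round(cents(octave)), round(cents(2 ** (1 / 12))))
```

A just fifth plus a just fourth gives exactly the 2:1 octave, and one tempered semitone measures 100 cents.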

We can classify scales as to how many degrees they have per octave and whether they are
tempered or just.

The just pentatonic scale, diffused throughout the world, is perhaps the oldest scale. The
Pythagorean just scale is the prototype for modern Western scales. Though it is highly desirable
for the intervals of the scale to be based on small integer ratios, like the harmonic series, some of
the Pythagorean intervals are harmonically dissonant.

The Pythagorean just scale can be expanded to 12 degrees to facilitate transposition and
modulation, but we end up with two tritones and two sizes of semitone.

Ptolemy suggested modifying the Pythagorean intervals to better suit what musicians actually
played. However, his ideas were suppressed until the Middle Ages. In any event, this did not solve
the fundamental problem of a transposable scale system with small integer ratios.





Consonance means "to sound well together." Dissonance is its opposite. Though it is tempting 
to look for consonance metrics in the mathematics of their ratios, the subject also depends upon 
culture and era as well as psychophysical response. 

The natural major scale developed by Zarlino was based on the pure 4:5:6 ratio. It succeeds at
making the thirds perfectly consonant, but it does so at the expense of the whole steps, which now
are uneven in size.

The mean-tone tempered scale regularized the size of the steps in the natural major scale using
tempering. But odd-sized intervals made the scale degrees fail to line up exactly with the harmonic series.

The underlying problem with all just scales is that the powers of the integer ratios 3/2 and 2/1 do
not form a closed system. It turns out that 12 fifths above seven octaves is one of the best
approximations to a closed system, yielding a system with 12 degrees per octave, but it is not a closed system.
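The failure of fifths and octaves to close can be checked exactly. This sketch, with hypothetical variable names, computes the gap between 12 just fifths and seven octaves (commonly called the Pythagorean comma).

```python
import math
from fractions import Fraction

twelve_fifths = Fraction(3, 2) ** 12
seven_octaves = Fraction(2, 1) ** 7

# If the system closed, this ratio would be exactly 1/1.
gap = twelve_fifths / seven_octaves
gap_cents = 1200 * math.log2(float(gap))

print(gap, round(gap_cents, 2))
```

The gap is small, under a quarter of a tempered semitone, but it is not zero, which is why some form of tempering is needed to close the circle.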

To close the octave so as to allow arbitrary transposition and modulation, we must use
tempering. Or we can throw out modulation and transposition and use a just scale. Or we can continue
to add scale degrees in an effort to throw additional scale degrees at the problem, increasing the
odds that some of them will be less dissonant. Mean-tone temperament optimizes only the thirds
and fifths in selected keys. Well-tempered scales make all keys usable but make some more purely
intoned than others. Equal-tempered scales make all keys sound the same. The idea that different
keys have a unique tonal palette stems from the well-tempered scales, which actually did sound
different in different keys.

The original aim of microtonal tuning was to supply alternative choices of intervals when
modulating or transposing so as to retain as much as possible the simple integer ratios of the just scales.
Examples of tempered microtonal scales include the 19-tone scale, the quarter-tone scale, and the
53-tone scale. Originally developed in the eighteenth century, just microtonal scales didn't catch
on because of the difficulties of constructing instruments to play them. Many cultures, such as
classical Hindustani music, are satisfied not to transpose but incorporate 22 just microtones called sruti
in the scale to provide a rich tonal palette. In the twentieth century, Harry Partch built an entire
orchestra to play music using his 43-tone just microtonal scale.

The hierarchical system of diatonic harmonicity began to break down after the equal-tempered
scale provided free transposition to any key. All keys were now alike because all intervals were
identical. With identical keys, after a while, composers no longer felt the compulsion to obey
the older tonal hierarchy. Arnold Schoenberg and his school devised a way to remove any
key-centeredness from their music by giving all pitches equal prominence. The result was the
deconstruction of tonal harmony in Western music. This was followed by the deconstruction of the
octave in the late twentieth century, for example, by the Bohlen-Pierce scale.

We live in an unbelievably rich time when all the musical traditions of the world, both current
and historical, are available to us, and we also have the means to construct new scales and build
new instruments to play. However, to construct a living music requires more than just a theory:
instruments, trained players, a body of musical work, and an interested and financially involved
public are also necessary.





3.20 Suggested Reading 

Backus, John. 1969. The Acoustical Foundations of Music. New York: W. W. Norton.

Barbour, J. Murray. 1947. "Bach and the Art of the Temperament." Musical Quarterly 33 (January): 64-89.

-. 1948. "Irregular Systems of Temperament." Journal of the American Musicological Society 1: 20-26.

-. 1953. Tuning and Temperament: A Historical Survey. East Lansing: Michigan State College Press.

Barnes, John. 1979. "Bach's Keyboard Temperament: Internal Evidence from the Well-Tempered Clavier." Early Music
7 (April): 236-249.

Benade, Arthur H. 1976. Fundamentals of Musical Acoustics. New York: Oxford University Press.

Blood, William. 1979. "'Well-Tempering' the Clavier: Five Methods." Early Music 7 (October): 491-495.

Bobbitt, Richard. 1959. "The Physical Basis of Intervallic Quality and Its Application to the Problem of Dissonance."
Journal of Music Theory 3 (November): 173-207.

-. 1980. "Das Wohltemperierte Clavier: Tuning and Musical Structure." English Harpsichord Magazine 2 (April):
137-140.

Bohlen, Heinz. 1978. "13 Tonstufen in der Duodezime." Acustica 39. English translation: "13 Tone Steps in the
Twelfth." Acustica 87 (2001, no. 5): 617-624.

Bosanquet, Robert. 1876. "Musical Intervals and Temperament." In Tuning and Temperament Library, vol. 4. Ed. Rudolf
Rasch. Utrecht: Diapason Press, 1987.

Carlos, Wendy. 1987. "Tuning: At the Crossroads." Computer Music Journal 11 (1): 29-43.

Carr, Dale C. 1974. "A Practical Introduction to Equal Temperament." Diapason 65 (3): 6-8.

Danielou, Alain. 1994. Music and the Power of Sound. Rochester, Vt.: Inner Traditions.

Fletcher, N. H., and T. D. Rossing. 1991. The Physics of Musical Instruments. New York: Springer-Verlag.

Hamilton, James A. 1844. Hamilton's Practical Introduction to the Art of Tuning the Pianoforte. London: R. Cocks.

Hellegouarch, Yves. 2002. "A Mathematical Interpretation of Expressive Intonation." In Mathematics and Art. Ed.
Claude P. Bruter. New York: Springer-Verlag.

Helmholtz, Hermann. 1863. On the Sensations of Tone. Second English edition, 1885. Trans. A. J. Ellis based on the
fourth German edition, 1877. New York: Dover, 1954.

Jorgenson, Owen H. 1977. Tuning the Historical Temperaments by Ear. Marquette: Northern Michigan University Press.

-. 1978. "In Tune with Old Tunings." Clavier 17 (November): 26-28.

-. 1991. Tuning: Containing the Perfection of Eighteenth-Century Temperament, the Lost Art of Nineteenth
Century Temperament, and the Science of Equal Temperament. East Lansing: Michigan State University Press.

Kellner, Herbert Anton. 1979. "A Mathematical Approach Reconstituting J. S. Bach's Keyboard Temperament." Bach
10: 2-9.

Lindley, Mark. 1974. "Early Sixteenth-Century Keyboard Temperaments." Musica Disciplina 28: 129-151.

Lloyd, Llewellyn. 1979. Intervals, Scales, and Temperaments. Ed. Hugh Boyle. New York: St. Martin's Press.

Mathews, Max V., and John R. Pierce. 1980. "Harmony and Nonharmonic Partials." Journal of the Acoustical Society of
America 68: 1252-1257.

Mathews, Max V., John R. Pierce, A. Reeves, and L. A. Roberts. 1988. "Theoretical and Experimental Explorations of
the Bohlen-Pierce Scale." Journal of the Acoustical Society of America 84: 1214-1222.

Mekiel, Joyce. 1960. "The Harmonic Theories of Kirnberger and Marpurg." Journal of Music Theory 4 (November):
169-193.

Pierce, John R. 1966. "Attaining Consonance in Arbitrary Scales." Journal of the Acoustical Society of America
40: 249.

Sargent, George. 1969. "Eighteenth-Century Tuning Directions: Precise Intervallic Determinations." Music Review
30 (February): 27-34.





Sethares, W. A. 1993. "Local Consonance and the Relationship between Timbre and Scale." Journal of the Acoustical
Society of America 94 (3): 1218-1228. Also, without the math, in "Relating Tuning and Timbre." Experimental Music
Instruments 9 (1993, no. 2) and at http://eceservO.ece.wisc.edu/~sethares/consemi.html.

Slaymaker, F. H. 1968. "Chords from Tones Having Stretched Partials." Journal of the Acoustical Society of America 47:
1469-1571.

Werckmeister, Andreas. 1691. "Musicalische Temperatur." In Tuning and Temperament Library, vol. 1. Ed. Rudolf
Rasch. Utrecht: Diapason Press, 1983.

Wilkinson, S. R. 1988. Tuning In. Milwaukee: Hal Leonard Books.



Physical Basis of Sound 


Music is a science which should have definite rules; these rules should be drawn from an evident principle;
and this principle cannot really be known to us without the aid of mathematics. Notwithstanding all the
experience I may have acquired in music from being associated with it for so long, I must confess that only
with the aid of mathematics did my ideas become clear and did light replace a certain obscurity of which I
was unaware before.

- Jean-Philippe Rameau, Traité de l'Harmonie

This book uses the international system of standard units defined by the Système International
d'Unités, abbreviated SI. It is also known as the MKS system of measurement, which stands for
"meter, kilogram, second." This system is used almost universally by the scientific community as
well as by most countries of the world except the United States. As an American, I may
occasionally slip back into old habits and use the so-called English "foot, pound, second" system. But since
even the English have abandoned it, I'm trying to do so as well.

4.1 Distance 

The fundamental SI unit of distance is the meter. The SI system multiplies the meter by exponents
of 10 to create other named magnitudes (table 4.1). Notice that from the millimeter on down, the
exponent decreases by 3 for each succeeding unit. Units larger than the kilometer, such as the
megameter and gigameter, will not arise much in the study of music and sound.

4.2 Dimension

Vectors convey both a direction and a magnitude. Vectors are usually drawn as an arrow whose
length represents the vector's magnitude and whose orientation indicates its direction.

A coordinate system is any method of specifying points. A set of vectors set at right angles to
each other defines the cartesian coordinates. A single such vector defines one-dimensional space,
two vectors at right angles define two-dimensional space, and so on.

Two vectors are orthogonal if they maximize the area they delineate. Three vectors are
orthogonal if they maximize the volume they delineate.





Table 4.1
SI Units of Distance

Unit         Symbol   Value                        Description
Kilometer    km       10^3 m  = 1000 m             Thousand
Meter        m        10^0 m  = 1 m
Decimeter    dm       10^-1 m = 0.1 m              (Little used)
Centimeter   cm       10^-2 m = 0.01 m             Hundredth
Millimeter   mm       10^-3 m = 0.001 m            Thousandth
Micrometer   µm       10^-6 m = 0.000001 m         Millionth
Nanometer    nm       10^-9 m = 0.000000001 m      Billionth

An area is the product of two orthogonal distances; the area of a circle is $\pi r^2$. A volume is the
product of three orthogonal distances. The volume of a sphere is $4\pi r^3/3$.

4.3 Time 

Sir Isaac Newton (1643-1727) provided the first published mathematical model of time in 1687
in his Philosophiae Naturalis Principia Mathematica, commonly known as The Principia. He
modeled time as a line that stretched continuously from the infinite past to the infinite future. Time
was thus eternal, having no beginning and no end. This approach to modeling time makes the
mathematics of music and sound tractable, but it raises a number of problems. For instance, modern
astronomy suggests that time had a beginning and will possibly have an end, depending upon
whether the universe will collapse, reach a steady state, or expand forever. But if time is eternal,
how can it be limited by the duration of our universe? And if time is not eternal, then what was
happening before time began?

Such confusions provide an object lesson on the limitations of mathematical models. They are
useful insofar as they accurately characterize the behavior of real systems. But in science reality
trumps a model's view of reality. Scientific revolutions come about when the limits of a model are
overcome by a more encompassing model. Newton's perspective on time has the advantage of
simplicity; it can still be used so long as we remain aware of its limitations.

The fundamental SI unit of time is the second. As with distance, the SI system creates other
named magnitudes by multiplying the second by powers of 10 (table 4.2), but time units
greater than 1 second are generally not in decimal organization. Instead, we have years, weeks,
days, hours, and minutes.

Table 4.2
SI Units of Time

Unit          Symbol   Magnitude                            Name
Kilosecond    ks       $10^{3}$ s = 1000 s                  Thousand
Second        s        $10^{0}$ s = 1 s
Decisecond    ds       $10^{-1}$ s = 0.1 s                  (Little used)
Centisecond   cs       $10^{-2}$ s = 0.01 s                 (Little used)
Millisecond   ms       $10^{-3}$ s = 0.001 s                Thousandth
Microsecond   µs       $10^{-6}$ s = 0.000001 s             Millionth
Nanosecond    ns       $10^{-9}$ s = 0.000000001 s          Billionth

4.3.1 Period and Frequency

There are two ways to use time as a measurement:

■ Period  The amount of time $T$ elapsed between the start and end of a single event is the period
of the event. When a train moves past at a constant speed, the time it takes for one car to pass by
a stationary observer is the car's period. Analogously, the period of vibration, periodicity, is the
time it takes one cycle of a wave to return to its starting point.

■ Frequency  The number of events $f$ occurring in a single elapsed time interval is the frequency
of the event. The number of trains passing through a train station per day tells how frequently the
trains run.

The two measurement strategies are reciprocal, that is, for some frequency $f$ and period $T$,

$f = \frac{1}{T}$,    Frequency (4.1)

and conversely,

$T = \frac{1}{f}$.    Period (4.2)

Periodicity of sound is typically measured in seconds (s). Frequency is measured in cycles per
second. The SI unit for one cycle per second of vibration is the hertz (Hz). The standard reference pitch
for Western orchestras is A440, corresponding to a periodic sound vibration of 440 Hz. The period
of one cycle of A440 is 1/440 = 0.00227 s, or 2.27 ms. It is convenient to express frequencies above
1000 Hz in kilohertz; thus 1000 Hz = 1 kHz.
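
The reciprocal relation between period and frequency in (4.1) and (4.2) can be checked numerically. The following is a minimal sketch; the function names are illustrative and do not come from the text.

```python
def period_from_frequency(frequency_hz: float) -> float:
    """Period T in seconds of an event repeating at frequency f (Hz): T = 1/f."""
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    return 1.0 / frequency_hz


def frequency_from_period(period_s: float) -> float:
    """Frequency f in hertz of an event with period T (seconds): f = 1/T."""
    if period_s <= 0:
        raise ValueError("period must be positive")
    return 1.0 / period_s


# The reference pitch A440 from the text.
a440_period = period_from_frequency(440.0)
print(f"A440 period: {a440_period * 1000:.2f} ms")
print(f"Back to frequency: {frequency_from_period(a440_period):.1f} Hz")
```

Applying one function and then the other returns the starting value, which is what "reciprocal" means here.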

4.4 Mass 

Together with time and distance, a basic measurement in the MKS system is mass, the quantity
of matter contained in an object. Matter is anything that occupies space and exhibits inertia.
Inertia is the tendency of a body to impede acceleration. Your body presses against the seat of
your car as you accelerate from a stop because the inertia of your body resists (impedes) the
acceleration. We can compare one mass to another using, for example, a beam balance. Or we
can measure it by applying a force and measuring the resulting acceleration.






Mass and weight are not the same thing. Mass is a quantitative measure of inertia. As such, mass
is an intrinsic quality of matter, unchanged by such things as the location of the object. Weight,
on the other hand, is the force of gravity acting on the object, and it depends upon the position of
the object with respect to other objects around it. For instance, you would weigh less standing on
the moon than you do on the earth because your weight depends upon your position. But the mass
of your body and the force required to accelerate it at a certain rate are the same regardless of
whether you are on the moon or the earth.

The rest of the physical concepts in this section are derived from these three primary 
measurements. 


4.5 Density 


Density measures how tightly packed together the material in a body is. Density comes in one-, 
two-, and three-dimensional versions: 

■ Linear density describes mass per unit of distance, for example, the density of a rope or guitar
string. For length $l$ and mass $m$, the linear density $\rho$ is

$\rho = \frac{m}{l}$.    (4.3)

■ Area density describes mass per unit of area, for example, the density of a drum head. For mass $m$
and area $a$, the area density $\gamma$ is

$\gamma = \frac{m}{a}$.    (4.4)

■ Cubic density indicates mass per unit of volume. For mass $m$ and volume $v$, the cubic density $\rho$ is

$\rho = \frac{m}{v}$.    (4.5)

Three-dimensional density is measured in kg/m$^3$ for large bodies or g/cm$^3$ for small bodies.
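
The three kinds of density all divide a mass by a measure of extent, and differ only in whether that extent is a length, an area, or a volume. The sketch below illustrates this; the function names and the guitar-string figures are hypothetical values chosen for the example, not measurements from the text.

```python
def linear_density(mass_kg: float, length_m: float) -> float:
    """Mass per unit length, in kg/m."""
    return mass_kg / length_m


def area_density(mass_kg: float, area_m2: float) -> float:
    """Mass per unit area, in kg/m^2."""
    return mass_kg / area_m2


def cubic_density(mass_kg: float, volume_m3: float) -> float:
    """Mass per unit volume, in kg/m^3."""
    return mass_kg / volume_m3


def kg_per_m3_to_g_per_cm3(density_kg_m3: float) -> float:
    """Convert kg/m^3 to g/cm^3 (1 kg = 1000 g, 1 m^3 = 1,000,000 cm^3)."""
    return density_kg_m3 / 1000.0


# Hypothetical guitar string: 3 g of material spread over 0.6 m of length.
print(linear_density(0.003, 0.6), "kg/m")

# Hypothetical body: 2 kg occupying 0.002 m^3.
rho = cubic_density(2.0, 0.002)
print(rho, "kg/m^3 =", kg_per_m3_to_g_per_cm3(rho), "g/cm^3")
```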


4.6 Displacement 

In this book I have many occasions to describe the motion of an object, such as a vibrating string,
air column, particle of air, or loudspeaker cone, so a careful explanation of motion is appropriate.
Displacement, the most basic attribute of motion, indicates distance from a starting point, or origin.
Distance in and of itself has nothing to do with motion, but insofar as displacement relates to a
starting position, it is an attribute of motion. When I use the term "displacement," it will always
carry this technical sense.






Figure 4.1 

Displacement. 


Suppose I start taking a movie of a car when it is some distance $s_0$ from an arbitrary point of
origin (figure 4.1). The value of $s_0$ indicates the distance of the car from the origin in the first frame.
Successive frames of the movie show the successive displacement of the car as it moves. The
difference between the car's position in the second frame and its position in the first frame is its
displacement, $\Delta s$. The Greek letter delta ($\Delta$) is used to signify that the variable to which it is attached
describes a difference between other values, in this case, the difference between $s$ and $s_0$. We can
also take $\Delta s$ to mean "the amount of change in $s$." Because $\Delta$ is so commonly used in this way, it
is called the first backward difference operator, so that in general, if we have a measurement $x_n$
and a previous measurement $x_{n-1}$, then

$\Delta x = x_n - x_{n-1}$.    First Backward Difference (4.6)

We can describe displacement as a vector. If we say that the entire distance that the car travels
in figure 4.1 is the vector $s$, we can define the displacement of the car from the origin in the second
frame as $s = s_0 + \Delta s$. Rearranging, we get

$\Delta s = s - s_0$,    Displacement (4.7)

which says that the displacement $\Delta s$ is the difference between the final position $s$ and the initial
position $s_0$.

The standard SI unit of displacement is the meter (m), but any SI unit of distance can be used
so long as the appropriate conversion factors are used.

4.7 Speed 

The ratio of distance to time is speed. More precisely, we speak of average speed as the distance
traveled divided by the time elapsed. Here's why we must call it average speed: suppose it takes
a jogger five minutes to run three blocks and then walk two more. Although moment-for-moment
her speed is uneven, we can still say that her average speed is one block per minute.

We can express average speed $\bar{v}$ in terms of distance $s$ and time $t$ as

$\bar{v} = \frac{s}{t}$.    Average Speed (4.8)

A bar over a variable indicates that it represents an average.

4.8 Velocity 

Velocity, like speed, relates distance to time. But velocity also specifies direction; directionless
velocity is just speed. For example, "the speed of sound" does not stipulate a direction for the
sound to travel. Speed is simply the magnitude component of a velocity vector without respect
to its direction. Velocity and speed are measured as the ratio of distance to time, such as meters
per second: m/s.

Suppose the position of the car at displacement $s_0$ (figure 4.1) corresponds to some time $t_0$. Then
if the car reaches displacement $s$ at time $t$, we say that the elapsed time $\Delta t$ is the difference between
those times:

$\Delta t = t - t_0$.    Elapsed Time (4.9)

The ratio of distance covered to elapsed time is the average velocity:

$\bar{v} = \frac{\Delta s}{\Delta t} = \frac{s - s_0}{t - t_0}$.    Average Velocity (4.10)

If the displacement $s - s_0 > 0$, the velocity is positive, otherwise it is negative. (A velocity of zero
is technically a positive value.) Ordinarily, positive velocity is indicated on the page as going to
the right. Note that there is no such thing as negative speed, so speed is always an unsigned value.
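
Equation (4.10) and the distinction between signed velocity and unsigned speed can be sketched in a few lines. The function names and sample numbers below are assumptions made for illustration; positions are measured along a single line, so direction reduces to a sign.

```python
def average_velocity(s0: float, s: float, t0: float, t: float) -> float:
    """Average velocity (s - s0) / (t - t0); the sign carries the direction."""
    elapsed = t - t0
    if elapsed == 0:
        raise ValueError("elapsed time must be nonzero")
    return (s - s0) / elapsed


def average_speed(s0: float, s: float, t0: float, t: float) -> float:
    """Average speed is the magnitude of the average velocity, never negative."""
    return abs(average_velocity(s0, s, t0, t))


# The jogger covers 5 blocks in 5 minutes: one block per minute on average.
print(average_velocity(0.0, 5.0, 0.0, 5.0))

# A car that ends up to the left of where it started has negative velocity
# but positive speed.
print(average_velocity(5.0, 2.0, 0.0, 3.0), average_speed(5.0, 2.0, 0.0, 3.0))
```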

4.9 Instantaneous Velocity

Consider again the case of a jogger who runs a few blocks and then walks a few. Her average
velocity clearly does not give a good indication about her speed moment-to-moment. It would be nice
if we could determine the instantaneous velocity of the jogger at any particular moment.

Suppose we have made a movie of the jogger as we did of the car. If we look at a single frame,
the motion is arrested and we can't get a sense of her movement, but if we take the difference of
her displacement between two adjacent frames, we can. For instance, if she runs past a meter stick,
we can estimate her velocity during the moment between the two frames by measuring the
displacement and dividing by the elapsed time. Suppose the camera snaps a picture every 1/24 of a
second and the distance she covered was 0.1 m between frames; then her instantaneous velocity
is about 2.4 m/s (0.1 m divided by 1/24 s). Still, there may be some variation in her speed even during
this time interval, however slight. We can generally reduce variation and improve accuracy by
measuring velocity over smaller and smaller time intervals.¹

We can continually refine an estimate of velocity by looking at ever smaller intervals of elapsed
time $\Delta t$ by having the camera take successive pictures more rapidly. Since the distance the jogger
covers between frames $\Delta s$ will also be correspondingly smaller, we begin to lose the big picture,
but we do get a clearer picture of the jogger's velocity during the time between measurements.

But at some point we'll reach the limit of the camera's fastest shutter speed, faster than which
the camera can't snap successive images. In the limit when the time elapsed between adjacent
movie frames ($\Delta t = t - t_0$) is infinitesimally small, the distance the jogger covers ($\Delta s = s - s_0$)
will also be infinitesimally small. But (and this is important) the ratio $\Delta s/\Delta t$ will not be
infinitesimally small because it is a ratio of two small but nonzero values.² As we snap pictures at
a faster and faster rate, and as both the time elapsed and the distance covered decrease, their
ratio, which is distance divided by time, approaches closer and closer to the value of the
instantaneous velocity.

Suppose we have an unbelievably fast camera. When we have increased the rate at which it takes
pictures so that the elapsed time is infinitely close to zero, we say we have actually reached the
instantaneous velocity. We memorialize this by saying that the instantaneous velocity is

$v = \lim_{\Delta t \to 0} \frac{\Delta s}{\Delta t}$,    Instantaneous Velocity (4.11)

which means "in the limit as $\Delta t$ approaches infinitely close to zero, the instantaneous velocity $v$
equals the ratio $\Delta s/\Delta t$."

It's worth thinking for a moment about what happens if we go too far with this shrinking process.
Though it is clearly impossible, suppose we had a camera that could take successive snapshots with
zero elapsed time, that is, $\Delta t = t - t_0 = 0$. Then the jogger would have covered no distance, and
$\Delta s = s - s_0 = 0$. Then successive images of the jogger would be identical; it would be like looking
at the same picture. We wouldn't be able to distinguish any motion, thereby defeating the purpose
of the measurement. So for (4.11) to yield meaningful results, we can't say $\Delta t = 0$; we must say
$\Delta t \to 0$, that is, $\Delta t$ approaches zero (it just never quite gets there).
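
The shrinking process behind (4.11) can be watched numerically. The sketch below assumes a hypothetical position function, $s(t) = t^2$ meters, chosen only because its instantaneous velocity at $t = 1$ s is known to be 2 m/s; the names are illustrative.

```python
def position(t: float) -> float:
    """Hypothetical position in meters at time t seconds: s(t) = t^2."""
    return t * t


def velocity_estimate(position_fn, t: float, dt: float) -> float:
    """Ratio of distance covered to elapsed time over the interval [t, t + dt]."""
    if dt == 0:
        # With zero elapsed time no motion can be distinguished at all.
        raise ValueError("dt must approach zero, not equal it")
    return (position_fn(t + dt) - position_fn(t)) / dt


# A faster and faster "camera": the estimates settle toward 2 m/s.
for dt in (0.1, 0.01, 0.001, 1e-6):
    print(f"dt = {dt:<8g} estimate = {velocity_estimate(position, 1.0, dt):.6f} m/s")
```

Both the numerator and the denominator shrink toward zero, yet their ratio settles on a finite value, which is the point of the limit.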

I always found the idea of limits to be a slippery concept to hang onto because the idea of a
number's approaching zero seems indefinite. A number either is zero or it is not zero, right? As I was
writing this section I remembered a Zen meditation practice where the novice is instructed to
meditate upon the "middle distance," that is, to focus on the space that is not too close nor too far away.
I recommend a variation of that perspective here. If we can just let $\Delta t$ in equation (4.11) become
infinitely close to zero without reaching it, many otherwise unexpected truths can emerge, a sort
of mathematical satori!

We can define instantaneous speed to mean "magnitude of the instantaneous velocity." In this
book, when I say "speed," I mean instantaneous speed, and when I say "velocity," I mean
instantaneous velocity.






4.10 Acceleration 


When the velocity of an object increases, we say it accelerates. When its velocity decreases, we
say it has negative acceleration, or decelerates. Average acceleration is change in velocity per unit
of time:

$\bar{a} = \frac{\bar{v}}{t}$.    Average Acceleration (4.12)

It is "average" because we are averaging the acceleration over a time interval.

Substituting for $\bar{v}$ from equation (4.8), we have

$\bar{a} = \frac{\bar{v}}{t} = \frac{s/t}{t}$,

so acceleration is

$\bar{a} = \frac{s}{t^2}$,    (4.13)

which means (assuming standard units) average acceleration is measured in meters per second per
second, or m/s$^2$.

We can tie average acceleration to particular velocities as follows. Given a starting velocity $v_0$
and a final velocity $v$ measured over an elapsed time interval $\Delta t$, we obtain average acceleration:

$\bar{a} = \frac{\Delta v}{\Delta t} = \frac{v - v_0}{t - t_0}$.    Average Acceleration (4.14)

The average acceleration $\bar{a}$ is a vector that points in the same direction as $\Delta v$.

We can define instantaneous acceleration as

$a = \lim_{\Delta t \to 0} \frac{\Delta v}{\Delta t}$.    Instantaneous Acceleration (4.15)

Thus, instantaneous acceleration is the limiting case of the average acceleration. In this book, when
I say "acceleration," I mean instantaneous acceleration.

4.10.1 Acceleration as the Bending of a Curve

Let's say that the car we filmed (figure 4.1) was accelerating. After filming it, we put the film in
a projector and start viewing it. The projector shows us the frames in the order $n = 0, 1, 2, \ldots$, with
elapsed time $\Delta t$ between each one, so we view the motion at the same speed it was filmed.

As we watch the car accelerate away from the origin, suppose I suddenly stop the projector at an
arbitrary frame $i$ so that now we see the car frozen in time at some moment $t_i = i \cdot \Delta t$. Now we can
determine the car's average acceleration from just this frame, the previous one, and the next one.






Figure 4.2 

Displacement of a car accelerating from a stop. 

Suppose B is the displacement of the car at the moment I stopped the film. The previous frame
was frame $i - 1$, at time $t_{i-1} = t_i - \Delta t$. Call A the displacement of the car at that moment. The next
frame we would see if we started the projector again is frame $i + 1$ at time $t_{i+1} = t_i + \Delta t$. Call C the
displacement of the car at that moment. Figure 4.2 shows a graph of the car's displacements A, B,
and C at the moments $t_{i-1}$, $t_i$, and $t_{i+1}$.

If we let $u_i$ be the displacement at B, then the displacement of the car at points A and C would be
$u_{i-1}$ and $u_{i+1}$, respectively. Now, the differences between these displacements can be named as follows:

Backward difference    $\Delta u_{AB} = u_i - u_{i-1}$        Displacement from A to B
Forward difference     $\Delta u_{BC} = u_{i+1} - u_i$        Displacement from B to C
Central difference     $\Delta u_{AC} = u_{i+1} - u_{i-1}$    Displacement from A to C

If we relate these differences to the elapsed time between the appropriate points, we can figure
out the average velocity of the car during the three frames:

Backward velocity    $\bar{v}_{AB} = \frac{u_i - u_{i-1}}{t_i - t_{i-1}}$            Average velocity from A to B
Forward velocity     $\bar{v}_{BC} = \frac{u_{i+1} - u_i}{t_{i+1} - t_i}$            Average velocity from B to C
Central velocity     $\bar{v}_{AC} = \frac{u_{i+1} - u_{i-1}}{t_{i+1} - t_{i-1}}$    Average velocity from A to C

So we have three slopes corresponding to the average velocity between the three points (figure 4.2).³

Figure 4.2 shows that the backward velocity is a shallower slope than the forward velocity: this
shows that the car must be accelerating. In fact, the average acceleration is just the difference
between these two slopes divided by the elapsed time. Recall from equation (4.14) that average
acceleration is the difference of two velocities over time, and velocities are slopes on a graph of
displacement vs. time, so acceleration is the amount of change in the slope of a curve over time.

Intuitively, the instantaneous acceleration at point B (figure 4.2) is the amount that the curve bends
at that particular point. The more a curve is bent, the greater must be the acceleration along it. Don't
forget this. It'll come in very handy when we consider the vibration of strings, reeds, bars, and membranes.

4.10.2 Estimating Instantaneous Acceleration 

We've seen that we need just one observation to measure an object's displacement from its origin, and
two to measure its average velocity. But it takes three observations to measure its average acceleration.

We can estimate the instantaneous acceleration of an object where we have three observations
separated by a finite time difference. (By "finite" I mean not infinite and not infinitesimal.)
Referring again to figure 4.2, if we have three displacements $u_{t-\Delta t}$, $u_t$, and $u_{t+\Delta t}$ at points A, B, and C,
separated by the time interval $\Delta t$, then the acceleration at point B is approximately

$a \approx \frac{u_{t+\Delta t} - 2u_t + u_{t-\Delta t}}{\Delta t^2}$.    Second-Order Central Difference Approximation (4.16)

The origins of (4.16), and why it is an approximation and not an equality, go beyond the scope of this
book. But this approximation will come in very handy when we study the vibration of strings and air.
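
The approximation (4.16) is the same thing as taking the forward slope minus the backward slope and dividing by the elapsed time. The sketch below shows this with three observations of a hypothetical car that starts from rest with a constant acceleration of 2 m/s$^2$; the displacement function and all names are assumptions for illustration only.

```python
def displacement(t: float) -> float:
    """Hypothetical car: starts at rest, constant acceleration of 2 m/s^2."""
    return 0.5 * 2.0 * t * t


def central_difference_acceleration(u_prev: float, u_curr: float,
                                    u_next: float, dt: float) -> float:
    """Estimate acceleration at the middle of three equally spaced observations."""
    return (u_next - 2.0 * u_curr + u_prev) / (dt * dt)


t, dt = 1.0, 0.1
u_a, u_b, u_c = displacement(t - dt), displacement(t), displacement(t + dt)

backward_velocity = (u_b - u_a) / dt   # slope from A to B
forward_velocity = (u_c - u_b) / dt    # slope from B to C

# Change in slope per unit time, computed two equivalent ways.
slope_change = (forward_velocity - backward_velocity) / dt
estimate = central_difference_acceleration(u_a, u_b, u_c, dt)

print(f"backward {backward_velocity:.3f} m/s, forward {forward_velocity:.3f} m/s")
print(f"acceleration estimates: {slope_change:.3f} and {estimate:.3f} m/s^2")
```

For this particular displacement curve the estimate recovers the true acceleration up to rounding; for curves whose acceleration varies, it is only an approximation, as the text notes.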


4.11 Relating Displacement, Velocity, Acceleration, and Time

Having developed the concepts of displacement, velocity, acceleration, and time separately, we
can now combine them to understand all aspects of the motion of an object traveling with constant
acceleration along a straight line. To simplify things a bit, assume that the object starts accelerating
from the origin $s_0 = 0$, so now the displacement $\Delta s = s - s_0 = s$.

Since we're only considering constant acceleration here, the average acceleration equals the
instantaneous acceleration, that is, $\bar{a} = a$.

4.11.1 Velocity under Constant Acceleration

Suppose the car depicted in figure 4.2 has initial velocity $v_0$ and constant acceleration $a$, and we
want to know its final velocity $v$ after elapsed time $t$. Since equation (4.14) relates all these
variables, $a = (v - v_0)/t$, we just have to solve (4.14) for $v$ to get the final velocity:

$v = at + v_0$ for constant acceleration.    (4.17)

4.11.2 Displacement under Constant Acceleration

We can get the displacement of the car by solving (4.10), $\bar{v} = (s - s_0)/(t - t_0) = s/t$, for
displacement $s$:

$s = t\bar{v}$ for constant acceleration,    (4.18)

but in order to solve this, we must know what the average velocity $\bar{v}$ is. Well, we have the final velocity
$v$ and the initial velocity $v_0$, so clearly the average velocity $\bar{v}$ must be the average of these two velocities:

$\bar{v} = \frac{v + v_0}{2}$ for constant acceleration.    (4.19)

(Note that (4.19) only applies when the acceleration is constant.) Now we can determine the
displacement of the car by substituting (4.19) into (4.18) to get

$s = \frac{t(v + v_0)}{2}$.    (4.20)

With equations (4.14), (4.17), and (4.20) we have solutions for acceleration, velocity, and
displacement of an object when acceleration is constant. By suitable choice of terms, we can use these directly
or combine them to solve any problem involving constant acceleration. For instance, none of these
equations directly deals with finding displacement when only acceleration, time, and initial velocity are
known. But we can find it easily enough. Start with (4.20) and substitute into it the value of $v$ from (4.17):

$s = \frac{t(v + v_0)}{2} = \frac{t[(at + v_0) + v_0]}{2} = v_0 t + \frac{at^2}{2}$    (4.21)

for constant acceleration. Using (4.21), we only need initial velocity, acceleration, and time to
determine displacement. Equation (4.21) has the interesting property that it can tell us the
displacement even when there is no acceleration. The first term $v_0 t$ gives the displacement if the
acceleration is zero and velocity remains constant at $v_0$, and the term $at^2/2$ gives the additional
displacement that results from nonzero acceleration.

One other combination of these variables will prove useful later. First, solving (4.17) for $t$ yields
$t = (v - v_0)/a$, and then using this definition for $t$ in (4.20) yields

$s = \frac{v - v_0}{a} \cdot \frac{v + v_0}{2} = \frac{v^2 - v_0^2}{2a}$.    (4.22)

This yields displacement if we don't know time but do know acceleration and the initial and final
velocities. Finally, solving for $v^2$ yields

$v^2 = 2as + v_0^2$.    (4.23)
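
Equations (4.17) and (4.20) through (4.23) are different views of the same motion, so they should agree on any example. The sketch below checks that with one hypothetical case (initial velocity 2 m/s, acceleration 3 m/s$^2$, elapsed time 4 s); the function names are illustrative.

```python
def final_velocity(v0: float, a: float, t: float) -> float:
    """Equation (4.17): v = a*t + v0."""
    return a * t + v0


def displacement_from_time(v0: float, a: float, t: float) -> float:
    """Equation (4.21): s = v0*t + a*t^2 / 2."""
    return v0 * t + a * t * t / 2.0


def displacement_from_velocities(v0: float, v: float, a: float) -> float:
    """Equation (4.22): s = (v^2 - v0^2) / (2*a), for nonzero a."""
    return (v * v - v0 * v0) / (2.0 * a)


def final_velocity_squared(v0: float, a: float, s: float) -> float:
    """Equation (4.23): v^2 = 2*a*s + v0^2."""
    return 2.0 * a * s + v0 * v0


v0, a, t = 2.0, 3.0, 4.0                            # hypothetical values
v = final_velocity(v0, a, t)
s = displacement_from_time(v0, a, t)

print("final velocity:", v, "m/s")
print("displacement via (4.21):", s, "m")
print("displacement via (4.20):", t * (v + v0) / 2.0, "m")
print("displacement via (4.22):", displacement_from_velocities(v0, v, a), "m")
print("v^2 via (4.23):", final_velocity_squared(v0, a, s), "vs", v * v)
```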





4.12 Newton's Laws of Motion

Suppose a small rocket plus its propellant has 1 kg of mass. The mass of the rocket's propellant
is a tiny fraction of the mass of the rocket, so we can neglect the fact that, as it burns, the rocket
contains less mass through time. Now send it into deep space so as to effectively eliminate friction
and the effects of gravity on its movement. When the rocket's engine is ignited, it supplies a
constant force and the rocket moves away in a straight line. As its propellant is expelled, the mass of
the rocket decreases ever so slightly, but it is such a small change that for our purposes the rocket's
mass remains virtually the same. Since we know the mass of the rocket, we can measure the force
that the engine applies by measuring the rocket's acceleration, according to the equation

$F = ma$,    Newton's Second Law of Motion (4.24)

where force is $F$, acceleration is $a$, and mass is $m$. Equation (4.24) is known as Newton's second
law of motion.

If a mass of 1 kg is accelerated at one meter per second per second (1 m/s$^2$), then the
strength of the force is said to be 1 newton (N).

We can derive additional information about acceleration by rearranging (4.24) as $a = F/m$. This
says that acceleration increases as $F$ increases and shrinks as $m$ increases. For example, if the
propellant constituted a substantial amount of the mass of the rocket, then as the propellant was
expelled, the rocket's mass would decrease and its rate of acceleration would correspondingly
increase.
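
The two readings of Newton's second law, $F = ma$ and $a = F/m$, can be sketched as follows. The 5 N thrust and the halved mass are hypothetical numbers chosen to make the effect of losing propellant easy to see; the function names are illustrative.

```python
def force_newtons(mass_kg: float, acceleration_m_s2: float) -> float:
    """Newton's second law: F = m * a, in newtons."""
    return mass_kg * acceleration_m_s2


def acceleration_m_s2(force_n: float, mass_kg: float) -> float:
    """Rearranged second law: a = F / m, in meters per second per second."""
    if mass_kg <= 0:
        raise ValueError("mass must be positive")
    return force_n / mass_kg


# By definition, accelerating 1 kg at 1 m/s^2 takes 1 newton.
print(force_newtons(1.0, 1.0), "N")

# Hypothetical engine supplying a constant 5 N of thrust.
thrust = 5.0
print(acceleration_m_s2(thrust, 1.0), "m/s^2 with the full 1 kg")
print(acceleration_m_s2(thrust, 0.5), "m/s^2 after half the mass is expelled")
```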

We can relate the concepts of force, mass, and motion as follows. When the rocket engine has
expelled all its propellant, its acceleration will become zero. But because there is virtually no
friction or other force in deep space, the rocket continues traveling indefinitely in the same direction
at its final velocity.

This is known as Newton's first law of motion, which can be stated as follows:

An object continues in a state of rest or motion at a constant speed along a straight line unless
compelled to change that state by a net force.

"Net force" means the sum of all forces acting on the object. Now, a greater force is required
to change the direction of an object with greater inertia. Newton's first law is sometimes also called
the law of inertia when expressed as follows:

Inertia is the natural tendency of an object to remain at rest or in motion at a constant speed
along a straight line.

The mass of an object is a quantitative measure of inertia.

Newton's third law of motion is often stated as follows:

For every action, there is an equal but opposite reaction.





Here "action" means force and "reaction" means force in opposition. For example, suppose an
astronaut suspended in space pushes against the side of a satellite with force $F$. The satellite pushes
back with a force $-F$, that is, with a force equal in magnitude but opposite in direction. After the
force is expended and they are no longer touching, the satellite and astronaut move away from each
other, each at a speed inversely proportional to its mass, so the lighter of the two moves away faster.

4.13 Types of Force

Force is an action in a particular direction upon an object, such as a push or a pull. The net force
on an object is the combination of all forces upon it. The effects of force can be seen when an object
accelerates, decelerates, twists, or deforms. Force is measured in units of newtons. Types of force
include gravity, friction, air resistance, turning (as in a screw), pressure, normal force, buoyancy,
and tension.

Forces can be categorized as to whether they are contact forces or noncontact forces.
Gravitational, electrical, and magnetic forces are examples of the latter because they can be effective
whether the forcing object and the forced object are touching or not.

4.13.1 Weight 

Weight is the force exerted by gravity. The force of gravity $F_g$ on a mass $m$ is

$F_g = mg$,    Force of Gravity (4.25)

which we know from (4.24) is expressed in terms of acceleration. Acceleration due to gravity at
sea level is about $g = 9.8$ m/s$^2$. So, according to (4.24), if we have a mass of 1 kg, for example, the
force of gravity exerted on it at sea level will be

$F = m \cdot a$

$F_g = 1 \cdot 9.8 = 9.8$ N.
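
Equation (4.25) makes the difference between mass and weight concrete: the mass stays fixed while the weight follows the local value of $g$. In the sketch below, the sea-level value 9.8 m/s$^2$ is from the text; the lunar value of roughly 1.62 m/s$^2$ and the 70 kg mass are assumptions added for illustration, and the function name is hypothetical.

```python
G_EARTH_SEA_LEVEL = 9.8   # m/s^2, the value used in the text
G_MOON_SURFACE = 1.62     # m/s^2, approximate value assumed for this example


def weight_newtons(mass_kg: float, g: float = G_EARTH_SEA_LEVEL) -> float:
    """Force of gravity on a mass: F_g = m * g, in newtons."""
    return mass_kg * g


# The 1 kg mass from the text.
print(weight_newtons(1.0), "N at sea level")

# The same hypothetical 70 kg body weighs less on the moon; its mass is unchanged.
mass = 70.0
print(weight_newtons(mass), "N on the earth")
print(weight_newtons(mass, G_MOON_SURFACE), "N on the moon")
```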

4.13.2 Normal Force

In mathematics, normal means perpendicular to a plane. So a normal force is one that is perpendicular
to surfaces that are in contact. For example, a weighing scale exerts a force opposite to gravity until
the spring force of the scale and the force of gravity balance, and the scale supports the object. The
normal force is a function of the electrical forces between charged particles within the atoms of the
compressed springs. We measure the weight of a body by observing the strength of the normal force,
indicated by the amount the springs are deformed. If the supporting surface is inclined, or if it is
accelerating, the normal force will not necessarily correspond to the weight of an object.

4.13.3 Frictional Force

Suppose we must push a heavy wooden box across a rough concrete floor. At first, even if we push
hard, it might not budge because the box is pushing back with an equal but opposite static frictional
force. Once in motion, its opposition to movement is called the sliding frictional force, or kinetic
frictional force.

The static frictional force is often greater than the corresponding sliding frictional force. Less
force is needed to keep an object moving than is needed to start it moving.

If a car is driven with its emergency brake set, it's hard to get it rolling because the brakes grab,
but once it attains some speed, the engine seems to have less work to do. If it slows to a stop, at
some point the more powerful static frictional force takes over and the car abruptly halts.

This is a good explanation of how a violin string vibrates under the influence of a bow. If the
bow is stationary, a powerful static frictional force sticks the bow and string together. As the bow
moves, it drags the string with it until the elastic force of the string overcomes the static frictional
force. The lesser sliding frictional force takes over as the string glides back opposite to the
direction of the bow. When the elastic force is spent, the string slows. As it slows, the more powerful
static frictional force kicks in again and entrains the string with the movement of the bow.

Friction is a nonlinear force because it does not vary uniformly with the velocity of the object
but tends to increase at low velocities. The forces of friction are a result of atomic-level interactions
between the sliding surfaces.

4.13.4 Tension

The tension of a guitar string can be thought of as a force that seeks to pull the two ends of the
instrument together. Indeed, in older guitars the strings sometimes bend the neck, pulling the nut toward
the bridge in a shallow bow. Alternatively, we can think of tension as the tendency of the string to
be pulled apart. One end of the string applies a force $T$ to the guitar, and as dictated by Newton's third
law, the guitar applies a reaction force $-T$ to the string. The same is true of the other end of the string;
hence the tension tends to pull the string apart. Tension is also a result of atomic-level forces.

4.14 Work and Energy 

If I apply a force $F$ to lift something a distance $s$ off the floor, then I have performed work to
counteract gravity. If I depress a string on a guitar in order to pluck it, I have similarly performed work
to counteract the string's force of tension.

Work is the force applied to move an object times the distance it is moved.

Mathematically,

$W = Fs$.    Work (4.26)

If there is no distance covered ($s = 0$), then no work is done. Thus, if I try to lift a piano and can't
budge it, I may exhaust myself, but I've done no work.

Force and distance are measured in newtons and meters, respectively, and since work is the
product of these two, it is measured in newton-meters. Fortunately, this rather unwieldy unit for
work has been given a simpler name in the SI system: the joule (J), named after James Joule
(1818-1889) for his research on work and energy.
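
Equation (4.26) and the immovable-piano remark can be sketched directly. The lifted mass, the height, and the force applied to the piano are hypothetical values, and the function name is illustrative.

```python
def work_joules(force_n: float, distance_m: float) -> float:
    """Work W = F * s, in joules (newton-meters)."""
    return force_n * distance_m


# Lifting a hypothetical 2 kg object 1.5 m requires a force equal to its
# weight, 2 kg * 9.8 m/s^2 = 19.6 N, applied over that distance.
lifting_force = 2.0 * 9.8
print(f"{work_joules(lifting_force, 1.5):.1f} J to lift the object")

# Pushing on a piano that never moves: a large force, zero distance, zero work.
print(f"{work_joules(5000.0, 0.0):.1f} J on the piano that will not budge")
```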

4.14.1 Kinetic Energy 

Energy is the ability to do work. When a force performs work on an object, the result is a change
in the kinetic energy of the object. Kinetic energy is energy due to movement. The work done by
the net forces on an object equals the change in the kinetic energy of the object. Indeed, we can
say in general that for kinetic energy $E_k$ and work $W$,

$E_k = W$;    (4.27)

for all intents and purposes, work and kinetic energy are identical. The SI unit for energy of all
types is also the joule (J).

If we substitute (4.26) into (4.27), we have $E = Fs$. Now, if we substitute (4.24), $F = ma$, into
this we get $E = mas$. For an object that starts from rest ($v_0 = 0$) and is accelerated at a constant
rate, (4.23) gives $v^2 = 2as$, so $as = v^2/2$ and we can rewrite this to read

$E = mas = m \cdot \frac{v^2}{2} = \frac{1}{2}mv^2$.

This means that kinetic energy $E$ is one-half the product of an object's mass $m$ and the square of
its velocity $v$, or

$E = \frac{1}{2}mv^2$.    Kinetic Energy (4.28)

From (4.28) we see that

Kinetic energy is proportional to the square of velocity.

For instance, when a car's velocity doubles, its kinetic energy quadruples. Suppose it is going
30 miles per hour and takes 30 feet after braking to come to a complete stop (that being the distance
it takes to completely dissipate the motion energy into heat). Then, if its speed doubles to 60 miles
per hour (assuming the same road conditions), it will take four times as far to stop (120 feet)
because the car has four times the amount of kinetic energy to dissipate.⁴
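
The braking example can be reproduced with the constant-acceleration relation (4.23). The sketch assumes that the brakes supply the same constant deceleration at both speeds (an idealization of "the same road conditions") and sets the final velocity to zero, so the stopping distance is $s = v_0^2/(2d)$ for deceleration $d$. The function names and the unit conversion constant are illustrative.

```python
MPH_TO_FT_PER_S = 5280.0 / 3600.0   # feet per mile divided by seconds per hour


def deceleration_from_stop(initial_speed: float, stopping_distance: float) -> float:
    """Constant deceleration implied by stopping from a speed within a distance."""
    return initial_speed ** 2 / (2.0 * stopping_distance)


def stopping_distance(initial_speed: float, deceleration: float) -> float:
    """Distance needed to stop from initial_speed under constant deceleration."""
    return initial_speed ** 2 / (2.0 * deceleration)


# The text's figures: 30 miles per hour, stopped in 30 feet.
braking = deceleration_from_stop(30.0 * MPH_TO_FT_PER_S, 30.0)

for mph in (30.0, 60.0):
    feet = stopping_distance(mph * MPH_TO_FT_PER_S, braking)
    print(f"{mph:.0f} mph: stops in {feet:.0f} ft")
```

Doubling the speed quadruples the distance because the speed enters squared, the same square that appears in the kinetic energy.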

There are many forms of energy, including electrical, thermal, chemical, radiant, nuclear, and
mechanical. Acoustical energy is a kind of mechanical energy.

4.14.2 Potential E nergy 

An object's kinetic energy is determined by its mass and velocity, as in equation (4.28). But an object
may also possess potential energy simply by virtue of its position. Like kinetic energy, potential energy
represents the ability to do work, but only potentially. Once potential energy is enabled to perform
work, it becomes kinetic energy. For example, an object suspended in the earth's gravitational field
has gravitational potential energy by virtue of its position with respect to the earth. The greater
the height, the greater its potential to do work if it were to be released and allowed to drop. If it
is allowed to fall, its potential energy changes to kinetic energy in proportion to the height of its
drop. Recalling that the force of gravity $F_g = mg$, we can say that the potential of an object to do
work because of the force of gravity is

$$W_g = mgh, \qquad \text{Gravitational Work} \quad (4.29)$$

where $h$ is the height of the object. Note that whether a ball rolls down a hill or falls vertically, the
work done by gravity would be the same, because only a change in vertical distance can be attributed
to the force of gravity; any other motion would have to be attributed to another force.
We can define the gravitational potential energy as

$$E_p = mgh. \qquad \text{Gravitational Potential Energy} \quad (4.30)$$

There are many other forms of potential energy besides gravity. A stretched or compressed
spring has elastic potential energy. A string under tension has tensile potential energy.
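
As a small illustration of equation (4.30), the sketch below computes gravitational potential energy
from mass and height. The value g = 9.81 m/s² is an assumed standard figure, and the function name is
hypothetical.

```python
G = 9.81  # m/s^2, assumed standard acceleration due to gravity


def gravitational_potential_energy(mass_kg, height_m, g=G):
    """Return E_p = m * g * h in joules, per equation (4.30)."""
    return mass_kg * g * height_m


# A 2 kg object raised 10 m; only the vertical height matters,
# not the path taken to get there.
print(gravitational_potential_energy(2.0, 10.0))
```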

4.15 Internal and External Forces

Forces are either internal or external. External forces can increase or decrease the available energy
in a system because the force comes from outside the system. If the external force is positive, the
system's energy will increase; if it is negative, its energy will decrease. External forces include
applied force, normal force, tensional force, frictional force, and air resistance. The system
experiencing the force may gain or lose kinetic energy, potential energy, or both. For instance,
friction always acts as a negative force to reduce the total energy in a system.

Internal forces cannot change the total energy of a system, but they can change kinetic energy
into potential energy, and vice versa. Internal forces include gravitational force, magnetic force,
electrical force, tensile force, and spring force. For instance, when the force of gravity displaces
an object from a high location to a lower one, some of its potential energy is transformed into
kinetic energy, but the total energy remains the same. Its movement under the influence of gravity
will undoubtedly include friction and other external forces, but the change due to the internal force
of gravity (or other internal force) does not itself change the total amount of energy.

4.16 The Work-Energy Theorem


The total mechanical energy $E$ of a system is the sum of its potential and kinetic energy:

$$E = E_k + E_p. \qquad \text{Total Mechanical Energy} \quad (4.31)$$





An external force applied to an object can change the total mechanical energy if work is done (that
is, if there is displacement, and the displacement is directly related to the applied force). If the work
is positive, energy is added to the system; if it is negative, energy is taken away. The gain or loss
in energy may be either kinetic or potential energy, or both. For instance, a rocket traveling upward
gains gravitational potential energy; if it is accelerating, it is also gaining kinetic energy. Work done
in these circumstances equals the change in mechanical energy, both potential and kinetic.

When work is performed on an object only by internal forces (such as springs or gravity) the
total mechanical energy of the system is unchanged. But the energy changes form: some kinetic
energy will be converted to potential energy, or vice versa. For example, a marble rolling down
the inside of a bowl loses potential energy as gravity draws it downward, but its kinetic energy
correspondingly increases as it drops. This is reversed when the marble climbs the far side of the bowl.
Similarly, in the moment that a vibrating piano string is at rest at its point of maximum stretch, its
energy is only potential; but when the tension force starts to pull it back, this potential energy is
converted to kinetic energy. Where there is no change in total mechanical energy, the total
mechanical energy is said to be conserved.
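
The marble example can be sketched in a few lines. This is an illustrative calculation only, assuming
a frictionless bowl, a marble released at rest, and g = 9.81 m/s²; it uses equations (4.30) and (4.31)
and nothing else. The function name and arguments are hypothetical.

```python
def kinetic_energy_at_height(mass_kg, release_height_m, height_m, g=9.81):
    """Kinetic energy of an object released at rest, at a later height.

    Assumes only conservative forces act, so E = E_k + E_p stays constant
    and E_k = E - E_p, with E_p = m * g * h.
    """
    total = mass_kg * g * release_height_m  # all potential at release
    potential = mass_kg * g * height_m
    return total - potential


# A 10 g marble released 0.1 m above the bottom of the bowl:
for h in (0.1, 0.05, 0.0):
    print(h, kinetic_energy_at_height(0.01, 0.1, h))
```

The kinetic energy is zero at the release height and largest at the bottom, mirroring the exchange
described in the paragraph above.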

4.17 Conservative and Nonconservative Forces

Another way to classify kinds of force is whether they dissipate energy or not. Conservative forces,
as the name implies, do not dissipate energy, whereas nonconservative forces do. Conservative
forces store energy that can be retrieved later. For instance, if I roll a ball up a slope, I increase its
potential energy. If later it rolls back down again, it does so by converting the previously stored
potential energy to kinetic energy. Thus, gravity is a conservative force. Other examples of
conservative forces include the elastic force of springs, momentum, and the electrical force between
charged particles, because these forms of energy can be stored and recovered. A wound spring can
unwind; a charged battery can discharge through a circuit; momentum given to a ball by a toss can
be delivered into the hands of the person catching it; and so on.

Nonconservative forces dissipate or transmit energy. Nonconservative forces include frictional
forces, viscous forces such as air resistance, and propulsive forces.

As with gravity, potential energy as well as kinetic energy can be associated with all conservative
forces. Kinetic energy of motion can be converted into potential energy of position, for
instance, when a moving object coasts up a hill, and can then be converted back to kinetic energy
when it rolls down again. While kinetic energy $E_k$ and potential energy $E_p$ may be interconverted
or transformed into each other, total energy is preserved, and $E = E_k + E_p$ according to the principle
of conservation of mechanical energy, provided no work is done by nonconservative forces.

Note that when a mass is lifted against gravity, the potential energy increases regardless of the
direction in which the mass is raised (so long as it is upward). And when the stored potential energy
is released by lowering the mass, the path downward does not matter: potential energy due to
gravity is measured only by the difference in height. If the work done is independent of the path the
motion takes, the force is conservative.






Another way to describe conservative forces is to consider a car on a hilly closed loop.
Discounting friction, the potential energy gained going up exactly balances the potential energy lost going
down when the car gets back to the start. Discounting friction, no net work is done by a conservative
force on a closed loop.

Only kinetic energy can be associated with nonconservative forces. For nonconservative forces, the
work done depends on the path the motion takes: the longer the path a sliding object takes, the greater
the work done by friction. Because no work can be stored in such a force, potential energy is not defined
for it. The energy may dissipate, for instance, as heat, or be transmitted, for instance, via sound waves.

The energy a musical instrument receives from a performer, such as when a string is struck on
a piano, comes from a nonconservative, propelling force because energy leaves the performer. The
instrument receiving the energy seeks to return to its original energy level by dissipating it. But not all
energy is immediately dissipated; some energy is stored in conservative forces, which work to
vibrate the string. The energy that dissipates from the string enters the sound board, is transmitted
into the air, and arrives at our eardrums.

As with this musical example, both conservative and nonconservative forces typically combine
in everyday situations to produce a net force on objects. The total work $W$ done by this net force
is the sum of conservative work $W_c$ and nonconservative work $W_{nc}$, and

$$W = W_c + W_{nc}.$$

According to the work-energy theorem, the work done by the net forces on an object equals the
change in the kinetic energy of the object, so we can also write

$$\Delta E_k = W_c + W_{nc}.$$

4.18 Power

Power is work done per unit of time. Thus, it also measures the rate at which energy is transferred
or transformed. The average power $P$ is the ratio of work $W$ to time $t$:

$$P = \frac{W}{t} \quad \text{or} \quad P = \frac{E}{t}. \qquad \text{Average Power} \quad (4.32)$$

The SI unit of power is J/s = W (joules/second = watt), named in honor of James Watt (1736-1819).
For instance, if it takes 1 second to transfer 1 joule of energy to lift a body up 1 m, then the power
expended is 1 watt. A 100 W lightbulb uses in 1 second the same amount of energy as would be
required to lift a mass that is pulled to earth by 1 newton of force 100 m up in the air in 1 second.
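
The lightbulb comparison can be verified with a short calculation. This is a sketch under the
assumption that the lifting work is simply force times height; the function names are illustrative.

```python
def lifting_work(force_n, height_m):
    """Work in joules done lifting against a constant force over a height."""
    return force_n * height_m


def average_power(work_j, time_s):
    """Average power in watts: work (or energy) divided by time, as in (4.32)."""
    return work_j / time_s


# Lifting against 1 newton through 100 m in 1 second:
print(average_power(lifting_work(1, 100), 1))  # 100.0
```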

4.19 Power of Vibrating Systems 


Kinetic energy tends to dissipate from where there is more to where there is less. So to perpetuate
a vibration requires a way to replenish its energy. This fact leads to an important distinction
between different kinds of musical instruments.





4.19.1 Classes of Musical Instruments

There are two classes of musical instruments:

■ Sustaining instruments receive energy continuously from the player and produce a continuous
output. The performer of an Australian didgeridoo uses circular breathing, and a pipe organ
has a motorized wind chest to supply a sustaining tone. Wind instruments can sustain for the
duration of a player's breath, and practiced string players can sustain tones indefinitely by periodically
reversing bow direction.

■ Nonsustaining instruments receive energy from the player only at note onset. Their sound lasts
until the energy is dissipated. Examples include the piano, guitar, banjo, and most percussion.
Dissipation by natural frictional forces causes the sound to die away slowly; the player can usually
stop the note more rapidly if desired by increasing the rate of dissipation, for example, by resting
the hand on a vibrating string.

4.19.2 Efficiency

Efficiency is the ratio of useful power output $p_o$ to total power input $p_i$. To express efficiency $e$ as
a percentage, we can write

$$e = 100 \cdot \frac{p_o}{p_i}.$$

Efficiency determines, among other things, the ease with which a brass or wind instrument can
be made to speak. For example, a trumpet is very efficient when the vibrating frequency of the
performer's lips matches one of the trumpet's natural vibrating modes. It is very inefficient at all other
frequencies, which helps produce a stable tone.

On the other hand, a piano sounding board is designed to have about the same efficiency for
every frequency so that no one frequency is favored over any other. Good loudspeaker systems are
also designed to have the same efficiency at all frequencies so they don't favor one frequency over
another.

These ideas are developed more fully in volume 2, chapter 6.

4.19.3 Power of Musical Instruments

Most musical instruments are not as powerful as a 100 W lightbulb. Here are some examples:


Orchestra              75 W
Percussion             1-10 W
Trombone fortissimo    6 W
Piano                  1 W
Violin                 0.1 W
Violin pianissimo      0.001 W






Why, then, are audio power amplifiers built to generate on the order of 300 to 600 W per channel?
It's because loudspeakers are very inefficient, on the order of 1-10 percent, a consequence of designing
them so that they will not color the sound by emphasizing selected frequencies.
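
To see how amplifier ratings relate to the acoustic powers listed for instruments, the efficiency
formula can be applied in both directions. The sketch below is illustrative only; the function names are
hypothetical, and the figures are the round numbers quoted in the text.

```python
def efficiency_percent(power_out_w, power_in_w):
    """Efficiency e = 100 * p_o / p_i, expressed as a percentage."""
    return 100.0 * power_out_w / power_in_w


def acoustic_output(power_in_w, efficiency_pct):
    """Useful output power in watts for a given input power and efficiency."""
    return power_in_w * efficiency_pct / 100.0


# A 300 W amplifier channel driving a loudspeaker that is 1 percent efficient:
print(acoustic_output(300, 1))  # 3.0
# A 600 W channel driving a 10 percent efficient loudspeaker:
print(acoustic_output(600, 10))  # 60.0
```

Under these assumptions the acoustic output lands between a few watts and a few tens of watts, the
same order as the instrument powers in the list above.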

4.20 Wave Propagation

Waves propagate in a medium by displacing differences in force or pressure from one place to
another. The crest in the rope in figure 4.3 moves along the rope by displacing the force that shook
it. An acoustical wave such as the one shown in figure 1.2 propagates in air by displacing pressure
differences from one space to another.

There are three ways that waves can propagate through a medium. The different ways relate the
direction of motion of the particles in the medium to the direction of wave motion through the
medium:

■ Transverse Like surface waves in water, the direction of motion that creates the wave is
perpendicular to the direction of wave motion. If we tie a rope to a wall at one end and shake the other
end, we might see shapes as shown in figure 4.3.

■ Longitudinal Like sound waves in air and under water, the direction of motion that creates the
wave is the same as the wave motion. Figure 4.4a shows a spring at rest. In 4.4b its left end is given
a momentary shove right, creating a compressed region. In 4.4c it is given a momentary shove left,
creating a stretched region while the compressed region continues to propagate to the right. In 4.4d
the left end of the spring is returned to its initial position while the compressed and stretched
regions continue to move to the right.

■ Torsional The direction of motion that creates the wave rotates about the axis of wave motion.
Putting a medium under twisting stress creates torsional waves. Figure 7.9 shows a Shive wave
machine, which is an example of torsional wave propagation. Torsional wave motion proceeds
down the central wire that connects all the machine's transverse bars.


Figure 4.4

Longitudinal waves. Panels: (a) no force applied; (d) returned to starting position. Arrows mark the
stretched and compressed areas moving to the right.


Figure 4.5

Wave amplitude. Labels: standard atmospheric pressure; peak.

4.21 Amplitude and Pressure

The amplitude of a wave is the distance from its peak height to its point of zero displacement or
equilibrium (figure 4.5). For a sound wave, amplitude is the difference between the wave's peak
pressure and standard atmospheric pressure. Sound amplitude is usually measured as sound
pressure level (SPL), which is the difference between the greatest pressure in a wave and standard
atmospheric pressure.

If you dive deeply into a swimming pool, you experience pressure against your eardrums.
Pressure is the force applied by the molecules of water pressing perpendicularly on the surface area of
your eardrum. Physicists would say that the force of the water is applied normal to the surface of
your eardrum. A normal force $F$ is distinguished from other forces by adding the symbol $\perp$ as a
subscript. So, in general, pressure $p$ is the amount of force applied normal to a surface, $F_\perp$, divided
by the area $a$ over which it is applied:

$$p = \frac{F_\perp}{a}. \qquad \text{Pressure} \quad (4.33)$$

Pressure is measured in newtons per square meter. The SI unit of pressure is the pascal (Pa),
named after the French scientist/mathematician Blaise Pascal (1623-1662). So 1 Pa = 1 N/m².
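
Equation (4.33) translates directly into code. The following is a minimal sketch with a hypothetical
function name; the example values are arbitrary and chosen only to show the units.

```python
def pressure_pa(normal_force_n, area_m2):
    """Pressure in pascals: normal force in newtons divided by area in square meters."""
    return normal_force_n / area_m2


# 1 newton spread over 1 square meter is 1 pascal.
print(pressure_pa(1.0, 1.0))  # 1.0
# The same force concentrated on half the area doubles the pressure.
print(pressure_pa(1.0, 0.5))  # 2.0
```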


4.22 Intensity


Energy from the motion of sound waves flows through the eardrums and into the inner ear, where it
registers as sound. Intensity $I$ is the energy $E$ per unit of time $t$ that is flowing across a surface of unit area $a$:

$$I = \frac{E}{ta}. \qquad (4.34)$$

According to equation (4.32), $P = E/t$. So we can say $I = P/a$, that is, intensity $I$ is the power flowing
across the surface of area $a$. The standard area unit is 1 m², so intensity is measured in W/m². We must
also take into account the direction the energy is flowing relative to the surface it is flowing across:

$$I_\perp = \frac{P}{\text{m}^2}, \qquad (4.35)$$

where $I_\perp$ is intensity flowing normal to the surface.


4.23 Inverse Square Law 

In the absence of any barriers, sound has a spherical radiation pattern. To compare sound intensities
at varying distances from a source (figure 4.6), we must make spherical measurements.



Figure 4.6

Comparison of spherical surfaces.







Let $P$ be the power of a wave at distance $r_1$ propagating along direction $V$. This means that an
amount of energy $P$ is flowing through surface $a_1$ each second. If no energy is lost, then the same
power will flow through $a_2$ (at distance $r_2$) each second as well, and $I_1 a_1 = I_2 a_2$, where
$I_1 = P/a_1$ and $I_2 = P/a_2$ are the intensities at the two surfaces. Since the areas of surfaces $a_1$ and
$a_2$ are proportional to the squares of their distances from the source $S$, the intensity $I$ varies
inversely as the square of the distance to the source, and

$$\frac{I_1}{I_2} = \frac{r_2^2}{r_1^2}. \qquad (4.36)$$
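
The inverse square law can be illustrated numerically. This sketch assumes an idealized point source
radiating equally in all directions with no losses, so that the power is spread over a sphere of area
4πr²; the function name is hypothetical.

```python
import math


def intensity_at_distance(power_w, distance_m):
    """Intensity in W/m^2 at a given distance from an ideal point source.

    Assumes lossless spherical radiation: the power is spread
    evenly over a sphere of area 4 * pi * r**2.
    """
    return power_w / (4.0 * math.pi * distance_m ** 2)


near = intensity_at_distance(1.0, 1.0)
far = intensity_at_distance(1.0, 2.0)
# Doubling the distance reduces the intensity to about one quarter.
print(near / far)
```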


4.24 Measuring Sound Intensity


Just as the range of frequencies we can hear is limited, so is our perception of sound intensity. The
threshold of hearing is the minimum amount of sound intensity required for a sinusoid to be detected
by an average listener in a noiseless environment. The limit of hearing (also called the threshold of
pain) is the intensity above which sound is registered as (possibly damaging) pain by most of us.

Perception of loudness is not as straightforward as perception of pitch. While loudness is
primarily affected by intensity, it is affected also by other perceptual and acoustic factors, especially
frequency. We are generally less sensitive to very low and very high frequencies (see section 6.5).
For the ear's most sensitive frequencies, around 1000 Hz, the range between the threshold of
hearing and the limit of hearing is staggeringly large:

■ Sound intensity at the threshold of hearing at a frequency of 1 kHz is approximately $I_{th}$ =
$1 \cdot 10^{-12}$ W/m² for a very sensitive listener.

■ Sound intensity at the limit of hearing at a frequency of 1 kHz is approximately $I_{lh}$ =
$1 \cdot 10^{0}$ W/m².

Thus, the range of sound intensities our ears can register at 1 kHz is on the order of $10^{12}$, which
is about 1 trillion to 1. No other sense faculty has this range of sensitivity.

4.24.1 Sound Intensity Scale

To establish a sound intensity scale, we must determine its boundaries and the gradations into
which it is divided. It makes sense to use the threshold of hearing and the limit of hearing as the
lower and upper boundaries of the scale. Expressing the range of hearing as a ratio shows the
extent of the scale:

$$\frac{I_{lh}}{I_{th}} = \frac{10^{0}\ \text{W/m}^2}{10^{-12}\ \text{W/m}^2}. \qquad \text{Intensity Range of Hearing} \quad (4.37)$$

But it is awkward to talk meaningfully about ratios with a range of 1 trillion to 1. It would be
easier if we could measure sound intensities using a small set of values that could be mapped
to the wide range of intensities. This is similar to the approach I took to represent pitch, where
the semitones of the equal-tempered scale provided a simple mapping between linear pitch and
exponential frequency.

Using the exponent of the powers of 10 as the units of the sound intensity scale would allow us
to represent the enormous dynamic range of perceived sound intensity simply with the numbers
0 to 12.

We can use the log function to extract the exponent of a quantity. For instance, $\log_{10} 10^{-12} = -12$.
So we can extract the exponents in (4.37) by writing

$$\log_{10}\frac{I_{lh}}{I_{th}} = \log_{10}\frac{10^{0}\ \text{W/m}^2}{10^{-12}\ \text{W/m}^2} = 12\ \text{bel}. \qquad \text{The Bel Scale} \quad (4.38)$$

The sound intensity scale developed this way is called the bel, invented by engineers at Bell
Telephone Laboratories in the 1920s and named in honor of Alexander Graham Bell (1847-1922).

Because we are measuring log ratios, the size of the bel increases with increasing differences
in intensity. For example, if $I = 10$ W/m² and $I' = 100$ W/m²,

$$\log_{10}\frac{I'}{I} = \log_{10}\frac{100}{10} = 1\ \text{bel}.$$

If $I = 10$ W/m² and $I' = 1000$ W/m²,

$$\log_{10}\frac{I'}{I} = \log_{10}\frac{1000}{10} = 2\ \text{bel}.$$

If $I = 10$ W/m² and $I' = 10{,}000$ W/m²,

$$\log_{10}\frac{I'}{I} = \log_{10}\frac{10{,}000}{10} = 3\ \text{bel}. \ldots$$

The bel scale covers the entire audible range of sound intensities with just a dozen integer values,
so we have satisfied an important design criterion. In fact, we have satisfied it a little too well: the
range 0-12 is too coarse-grained for practical work. The preferred unit is the decibel (dB), which
has ten times the resolution of a bel:

$$10\log_{10}\frac{I_{lh}}{I_{th}} = 10\log_{10}\frac{10^{0}\ \text{W/m}^2}{10^{-12}\ \text{W/m}^2} = 120\ \text{dB}. \qquad \text{The Decibel Scale} \quad (4.39)$$

Perhaps you've heard that the intensity range of hearing is 120 dB. Now you know where that
number came from.

We can generalize the decibel scale to compare two arbitrary intensities. The sound intensity
level in decibels of a sound with intensity $I'$ is defined as

$$10\log_{10}\frac{I'}{I}, \qquad \text{The Decibel} \quad (4.40)$$

for some reference intensity $I$. We can use (4.40) to make a comparison of two relative intensities.
For instance, if $I = 10^{-2}$ W/m² and $I' = 10^{-1}$ W/m², then the intensity of $I'$ is greater than the
reference $I$ by $10\log_{10} 10^{-1-(-2)}$ dB = 10 dB.

We can use (4.40) to make a loudness comparison against either $I_{th}$ or $I_{lh}$, depending upon whether
we are measuring up from silence or down from the limit of pain. Sound intensity meters
commonly measure up from silence by setting $I = I_{th}$ in equation (4.40). A very quiet room might have
a sound intensity level of about 40 dB. Continuous exposure to a sound intensity level of over
90 dB can be harmful to hearing. So the useful range of musical intensities is from about 45 dB
to 95 dB, with peaks ranging upward to a maximum of 120 dB.
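
Equation (4.40) is easy to try out in code. The following is a minimal sketch, assuming the threshold
of hearing quoted earlier (10⁻¹² W/m²) as the default reference; the names `I_TH` and
`intensity_level_db` are illustrative and not taken from the text.

```python
import math

I_TH = 1e-12  # W/m^2, threshold of hearing at 1 kHz as quoted in the text


def intensity_level_db(intensity, reference=I_TH):
    """Sound intensity level in decibels: 10 * log10(I' / I), as in (4.40)."""
    return 10.0 * math.log10(intensity / reference)


# Limit of hearing (1 W/m^2) measured up from the threshold: about 120 dB.
print(intensity_level_db(1.0))
# Comparing two arbitrary intensities, 10^-1 against 10^-2 W/m^2: about 10 dB.
print(intensity_level_db(1e-1, reference=1e-2))
```

Passing a different `reference` measures the level against any other intensity, such as the limit of
hearing.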

Why was 10 chosen as the base? Why not pick 2 as the base, as we did for relating pitch to
frequency (see equation (2.1))? One reason is that there is no obvious loudness equivalent to the
interval of the octave. We don't always interpret a doubling of intensity as twice as loud. Also, powers
of 10 give a much more compact scale to work with than would powers of 2.

To obtain an absolute intensity level from a decibel level requires reversing the previous process:

1. Convert decibels to bels.

2. Make the resulting value an exponent of 10.

3. Make it proportional to the reference against which it was originally measured, that is, $I_{th}$ or $I_{lh}$.

For example, for a level of $x$ dB measured against a reference intensity $I$,

$$I' = I \cdot 10^{(x\ \text{dB})/10}\ \text{W/m}^2. \qquad \text{Decibel-to-Intensity Conversion} \quad (4.41)$$
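
The three steps listed above can be sketched as a small function. This is an illustration under the
assumption that the level was measured against the threshold of hearing (10⁻¹² W/m²) unless another
reference is given; the function name and parameter names are hypothetical.

```python
def db_to_intensity(level_db, reference=1e-12):
    """Convert a decibel level back to an absolute intensity in W/m^2."""
    bels = level_db / 10.0    # step 1: decibels to bels
    ratio = 10.0 ** bels      # step 2: use the value as an exponent of 10
    return reference * ratio  # step 3: scale by the reference intensity


# 120 dB above the threshold of hearing is about 1 W/m^2.
print(db_to_intensity(120.0))
# 0 dB returns the reference itself.
print(db_to_intensity(0.0))
```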

The decibel scale is used to measure sound level and is also used in sound recording and
communications. Another scale based on the same logarithmic principle is the Richter scale, used to
measure the intensity of earthquakes. Variants of the decibel are used to measure power, sound
pressure, voltage, or intensity.

4.24.2 Loudness in Recording Equipment

Ambient sound levels, such as in a concert hall or factory, are typically measured up from
the threshold of hearing. In contrast, recording engineers usually want to measure down from the
limit of the loudest sound they can record without distortion on their recording equipment. Let $I_{lr}$ be
the limit of recording, louder than which the recorder would distort. Now let $I = I_{lr}$ be the reference
loudness in (4.40). Then the loudness of the sound we want to compare is $I'$. The loudest sound we
could record without distortion is $I' = I_{lr}$. The decibel value corresponding to this is $10\log(I_{lr}/I_{lr}) = 0$ dB.
For any softer sound, $I' < I_{lr}$, and the corresponding dB value will be negative. For this reason, the
level meters on recorders are measured in negative dB values (figure 4.7). Approximate musical
loudness levels are given for comparison. A value of 0 dB on such a scale means the recorded sound
is very close to saturating the recording medium. Negative dB values indicate softer levels.

It is customary for the designers of recording equipment to leave 10 dB or so for headroom at
the top of the scale as insurance against any sound's being distorted. So, typically the reference
intensity $I = I_{lr} - 10$ dB. This is why level meters on recorders show some positive dB values above
0 dB. The top positive value is where distortion would actually begin. Any sound with loudness
above 0 dB is in danger of being distorted.

Figure 4.7

Loudness levels, measuring down from the limit of hearing. Scale: -120 dB to 0 dB in steps of 10 dB,
annotated with musical dynamics from ppp to fff.

Figure 4.8

VU meter.

Recording equipment typically does not have the same wide dynamic range as human hearing.
Let the threshold of recording $I_{tr}$ be the level below which the noise floor of the recorder's
electronics overshadows the recorded sound. Often, the practical limit is on the order of -90 to
-100 dB. Below that the noise floor of the recorder's electronics is louder than the recorded sound.
Figure 4.7 reflects these considerations. Figure 4.8 shows an example volume unit (VU) meter with
a scale that goes to -60 dB.

Because the decibel scale is logarithmic times 10, we know we can make a sound ten times
more intense by raising its level by 10 dB. We can work out the decibel values corresponding to
doubling or halving the intensity as follows. We know from algebra that $\log(I/I') = \log I - \log I'$ (see
appendix A). So, for instance, $\log(2/1) = \log 2 - \log 1$. Now $\log 2 \approx 0.3$ and $\log 1 = 0$, and
$10 \cdot (0.3 - 0) = 3$ decibels per doubling of intensity.⁵ Thus, if we double the intensity of a
sound, we raise its level by approximately 3 dB.
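
The 3 dB rule can be confirmed with a one-line calculation. The sketch below is illustrative; the
function name is hypothetical, and it computes only the change in level for a given intensity ratio.

```python
import math


def level_change_db(intensity_ratio):
    """Change in decibels when intensity is multiplied by the given ratio."""
    return 10.0 * math.log10(intensity_ratio)


print(level_change_db(2.0))   # doubling: roughly +3 dB
print(level_change_db(0.5))   # halving: roughly -3 dB
print(level_change_db(10.0))  # tenfold increase: +10 dB
```

The exact figure for a doubling is slightly more than 3 dB, since log 2 is a little larger than 0.3.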

4.24.3 Sound Pressure

Unfortunately, measuring energy flow is not always convenient or even possible. Intensity is only
meaningful when energy flows through an area. Yet there are occasions when energy is present
but not flowing. For example, consider a thick steel pipe tightly sealed at both ends with steel caps,
containing a battery-operated radio. Virtually all the energy from the radio sound reflects off the
ends of the pipe and virtually no sound escapes. The resulting waveforms inside the pipe are called
standing waves. Since virtually all energy that the radio's loudspeaker pumps into the air in the
pipe is returned to it by the captured air, virtually no energy flows, and there is virtually no
measurable intensity.

However, average pressure variation remains meaningful even with standing waves because we can
still measure the pressure difference inside the pipe between the pressure crests and troughs. Since it
is also generally easier to build equipment to measure pressure differences than it is to measure
intensity, it would be very convenient if we could find a way to relate sound pressure to sound intensity.

Relating Sound Pressure to Intensity The relation of intensity $I$ of a sinusoid to the value of
the average pressure variation $\Delta p$ in air is



where $V$ is the velocity of sound in air and $\delta$ is the density of air. The pressure variation $\Delta p$ is one
half the pressure difference from peak to trough of the wave (that is, it is the distance from the mean
to the extreme), measured in pascals (N/m²).

Equation (4.42) says that intensity is proportional to the square of the pressure variation. This is the
most important part of this equation. It also says that increasing the velocity of sound in the medium
or increasing the density of the medium would decrease the intensity. However, since we're almost
always dealing with air at or near standard atmospheric pressure, these factors can be neglected. The
most important item is the square relationship between pressure variation and intensity.

Using standard values for $V$ and $\delta$, if we take the threshold of hearing as $I = 10^{-12}$ W/m²
(intensity), then under normal atmospheric conditions, the average pressure variation corresponding to
this intensity will be $2 \cdot 10^{-5}$ N/m².

So we can relate average pressure variation to intensity. But how can we actually measure the
average pressure variation of a sound? There are a number of approaches we can take.

Measuring Sound Pressure We could measure air particle displacement, that is, the distance
air molecules are pushed aside, but that is difficult to measure because of the random thermal
motion of gas molecules, and we would have to be able to resolve to a distance of 1 micrometer
or less just to observe them. We could measure air particle velocity, but that has the same problem.

By far the easiest way to measure the strength of a sound is to measure the average pressure
change over a large area. Since pressure is force per unit of area, all we have to do is sample a large
enough area to get a force we can register on even relatively crude measuring devices. So we define
sound pressure level (SPL) as the average pressure variation per unit area.

Standard atmospheric pressure is about $10^5$ Pa, roughly 14.7 pounds per square inch. Pressure level
changes caused by sound waves are very small in comparison, ranging from about 0.1 Pa at the
threshold of hearing up to 1 Pa at the limit of hearing. In comparison to atmospheric pressure,
sound pressure is minuscule.



Figure 4.9

Simplified barometer and microphone. Barometer labels: vacuum, high pressure, fulcrum point,
restraining spring. (b) Microphone labels: flexible diaphragm that responds to air pressure changes
and vibrates in and out, electrical motion detector, small hole that normalizes the inside to
ambient pressure.


The Microphone The easiest way to measure sound pressure level is with a microphone. Like
barometers, microphones measure air pressure variation. Whereas a barometer encloses a vacuum
and can therefore measure absolute pressure changes, a microphone encloses a small volume of
air at the prevailing atmospheric pressure level and records the relative pressure changes of passing
sound waves. A small flexible diaphragm on the end of the microphone enclosure is displaced by
high and low pressure wave fronts impinging on it from a passing sound wave. The amount of
displacement can be measured electrically by a number of different techniques. Figure 4.9a shows a
simplified barometer, and figure 4.9b shows a simplified microphone.

Given how vanishingly small the pressure fluctuations of sound are, how can a relatively
massive microphone diaphragm be displaced by such tiny forces? The answer is, we make the
diaphragm large so that it encounters more of the sound's force field. But this poses other design
problems. If we make it too large, the diaphragm becomes heavy and unresponsive. So we make
it as thin as possible to reduce its mass. But then it becomes fragile. So we choose materials that
are strong but flexible. This is the domain of mechanical engineering.

4.24.4 Proximity Effect

There's another important and practical microphone design problem that arises from the geometry
of waveforms. Waves expand from a sound source spherically under normal conditions; what
happens when a curved wave front encounters a flat microphone diaphragm?

If the diaphragm is near the sound source, and the wavelength is sufficiently short (and therefore
the frequency is high), the diameter of the diaphragm is large in comparison to the diameter of the
spherically expanding wave. As shown in figure 4.10a, the diaphragm cuts across several high and
low pressure areas. The high pressure areas cancel the low pressure areas for a diaphragm in this
position, so it receives little net energy.

The diaphragm in figure 4.10b is relatively far from the source and does not cut across pressure
areas. It experiences uniform high and low pressure across its whole surface at this frequency and
distance, thereby receiving maximum net energy from the passing wave.







Figure 4.10 

Proximity effect. 

At a sufficient distance from the source, sound waves tend to flatten out into sheets (plane
waves). The farther the receiver is from the source, for a same-sized receiver, the more planar the
wave front becomes. The region near the source where the curvature of the wave front is significant
to the receiver is the near field, and the region beyond, where the curvature stops being significant, is the far
field (see section 7.6).

The significance of this phenomenon, the proximity effect, is that as the distance from
microphone to spherically radiating sound source decreases, the high-frequency sensitivity of the
microphone decreases. As a speaker moves closer to a microphone, the timbre of the speaker's voice
becomes warmer. That's the proximity effect in action. The proximity effect can be reduced, if
necessary, by choosing a microphone with a smaller diaphragm. It can also be fixed by moving the
microphone farther away, but this can cause it to start picking up undesirable room noise. This is
the domain of audio engineering.

4.25 Summary 

The Systéme International d'Unités (SI) is used to represent basic physical proportions, matter, 
distance, dimensión, and time. 

Periodicity is the amount of time elapsed between the start and end of a single event. Frequency is the number of events occurring in a single elapsed time interval. Periodicity and frequency are each other's inverses.

Mass is the quantity of matter contained in an object. Matter is anything that occupies space and exhibits inertia. Inertia is the tendency of a body to impede acceleration. Mass is an intrinsic quality of matter, unchanged by such things as location. Weight, the force of gravity acting on an object, depends upon the object's location.

Density measures how tightly packed together the material in a body is. We can distinguish linear distance, area distance, and volumetric distance.

Displacement indicates distance from a starting point or origin. The ratio of distance to time is average speed. Velocity indicates distance per time interval as well as linear direction. Instantaneous velocity is the ratio of two small but nonzero values of distance and time. We take the limit of the ratio as the time interval goes toward zero.

Average acceleration is change in velocity measured in meters per second per second. Instantaneous acceleration is the ratio of two small but nonzero values of velocity and time. We take the limit of the ratio as the time interval goes toward zero. When plotted, the instantaneous acceleration at a point is just the amount that the curve bends at that particular point.

If we know three of the four motion variables (displacement, velocity, acceleration, and time), we can always find the fourth by algebraic substitution.

Newton's first law of motion states that, because of inertia, an object continues in a state of rest or motion at a constant speed along a straight line unless compelled to change that state by a net force. The mass of an object is a quantitative measure of inertia. Newton's second law of motion equates force to mass times acceleration. Newton's third law of motion states that for every action, there is an equal but opposite reaction.

Force is an action in a particular direction upon an object. Contact forces include friction, air resistance, turning (as in a screw), pressure, normal force, buoyancy, and tension. Noncontact forces include gravitational, electrical, and magnetic forces.

Work is the force applied to move an object times the distance it is moved. Energy is the ability to do work. When a force performs work on an object, the result is a change in the kinetic energy of the object. Kinetic energy of an object is proportional to the square of the object's velocity.

Objects may possess potential energy by virtue of position. Forms of potential energy include gravitational, elastic, and tensile energy.

External forces can increase or decrease the available energy in a system. They include applied force, normal force, tensional force, frictional force, and air resistance. Internal forces cannot change the total energy of a system, but they can change kinetic energy into potential energy, and vice versa. Internal forces include gravitational force, magnetic force, electrical force, tensile force, and spring force.

The total mechanical energy of a system is the sum of its potential and kinetic energy. When work is performed on an object only by internal forces, the total mechanical energy of the system is unchanged but the energy changes form.

Conservative forces store energy that can be retrieved later. Nonconservative forces dissipate or transmit energy. Nonconservative forces include frictional forces, viscous forces such as air resistance, and propulsive forces. No net work is done by a conservative force on a closed loop. Only kinetic energy can be associated with nonconservative forces. For nonconservative forces, the work done depends on the path the motion takes. Conservative and nonconservative forces combine in everyday situations to produce a net force.

Power is work done per unit of time. Thus, it also measures the rate at which energy is transferred or transformed. Kinetic energy tends to dissipate from where there is more to where there is less. To perpetuate a vibration requires constantly replenishing lost energy. There are sustaining and nonsustaining musical instruments, differentiated by whether there is a constantly renewable source of energy to drive the instrument's vibration.





Efficiency is the ratio of useful power output to total power input.

Waves propagate in a medium by displacing differences in force or pressure from one place to another. The movement can be transverse, longitudinal, or torsional.

The amplitude of a wave is the distance from its peak height to its point of zero displacement or equilibrium. For a sound wave, amplitude is the difference between the wave's peak pressure and standard atmospheric pressure. Sound amplitude is usually measured as sound pressure level, which is the difference between the greatest pressure in a wave and standard atmospheric pressure. Pressure is the amount of force applied normal to a surface divided by the area over which it is applied.

Energy from the motion of sound waves flows through the eardrums and into the inner ear, where it registers as sound. Intensity is the energy per unit of time (power) that is flowing across a surface of unit area. Sound has a spherical radiation pattern if not blocked. Intensity varies inversely as the square of the distance from the source.

The ear detects sound intensity between the threshold of hearing and the limit of hearing. The decibel is 10 times the logarithm base 10 of the ratio of two intensities. The decibel scale covers the entire audible range of sound intensities with 120 values.

Measurements of ambient sound, such as in a concert hall or factory, are typically measured up from the threshold of hearing. In contrast, recording engineers usually measure down from the limit of the loudest sound they can record without distortion.

Sound intensity is typically less useful as a measure than sound pressure because there are pressure differences even in standing waves that we can measure. Although we could measure particle displacement or velocity, it's easier to measure average pressure variation per unit area, called sound pressure level (SPL). A microphone measures relative pressure variation from one side of a diaphragm to the other.




Geometrical Basis of Sound 


Geometry is frozen music.

-Goethe 

5.1 Circular Motion and Simple Harmonic Motion

Suppose a pendulum swings back and forth above a turntable. The turntable has a marker, such as a small cone, placed on its surface (figure 5.1). The cone moves with uniform circular motion because a motor drives it in a circle at a constant speed. Now adjust the length of the pendulum so that it makes one full swing in the same time that the turntable makes one complete revolution, and release the pendulum at exactly the same moment the cone moves under it so that the two movements are synchronized. With the two motions so aligned, if we look directly edge-on at the turntable, the pendulum and the cone seem to have exactly the same left/right motion even though we know that the pendulum moves in a line while the turntable moves circularly. Intuitively, it looks like circular motion and simple harmonic motion are in some way equivalent if seen from the right vantage point. This train of thought suggests that we can use the geometry of circles to study simple harmonic motion and wave behavior.

5.2 Rotational Motion 

Circular motion and simple harmonic motion are closely related. In fact, to understand circular motion is to understand sine waves, which are the basis of all musical sound. This section reviews information provided by geometry and trigonometry about circular motion.

5.2.1 Angular Displacement 

The center of a rigid rotating body, such as a turntable, defines its axis of rotation as a point around which circular motion revolves. The angle through which the rigid body rotates about its axis of rotation is its angular displacement. Suppose a turntable rotates from an initial angle $\theta_0$ to a final angle of $\theta_f$. We say the turntable sweeps out the angle $\theta$, defined as

$$\theta = \theta_f - \theta_0. \qquad \text{Angular Displacement} \tag{5.1}$$






Figure 5.1 

Pendulum and turntable.



Figure 5.2 

Radian measure. 


Rotatable objects can turn either clockwise or counterclockwise. 

Counterclockwise angular displacement is taken to be positive, and clockwise angular displacement is taken to be negative.

Thus, $\theta$ indicates counterclockwise rotation, and $-\theta$ indicates clockwise rotation.

5.2.2 Radians 


It is common to use degrees to measure angular displacement or to refer to entire revolutions. One revolution returns a turntable to its initial position and equals 360°.

Suppose a turntable sweeps out an angle $\theta$ as shown in figure 5.2. As it does so, point Q traces out an arc of length $s$. Clearly, the length of $s$ grows if either its radius $r$ or the angle $\theta$ grows. In fact, we can show with elementary geometry that

$$\theta = \frac{\text{Arc length}}{\text{Radius}} = \frac{s}{r}. \tag{5.2}$$


When $s/r = 1$, that is, when the arc length is the same as the radius, the angle $\theta$ is equal to 1 radian (rad). Since both $s$ and $r$ are measures of distance, their ratio is a dimensionless number (because the units in the numerator and denominator cancel out). A dimensionless number is a "pure number" unencumbered with physical significance.

Figure 5.3

Angular rotation.

If the point Q sweeps out one entire revolution of radius $r$, its angular displacement will be $\theta = 2\pi$ and its arc length $s$ will equal the circumference of the circle, $2\pi r$.

Since one revolution equals 360°, we can equate degrees and radians. If $\theta$ = 360°, then $s = 2\pi r$ and

$$\theta = \frac{s}{r} = \frac{2\pi r}{r} = 2\pi\ \text{rad}, \tag{5.3}$$

and $2\pi$ rad = 360°. Solving for rad, we see that one radian is

$$\text{rad} = \frac{360^\circ}{2\pi} \approx 57.3^\circ. \qquad \text{Radian} \tag{5.4}$$

This constant, the radian measure, allows us to use simple integers and ratios of integers to specify useful divisions of a circle.¹ For example, the circumference of the circle is $2\pi$ radians, and a half circle (180°) is one half of that, exactly $\pi$ radians. Similarly, one quarter of the circumference is $\pi/2$ radians, which is therefore 90°, the size of a right angle.

When angles are stated in radians, the constant $\pi$ tends to drop out from equations, greatly simplifying calculations. Radian measure also simplifies calculation of the length of an arc. Solving (5.2) for $s$ yields

$$s = r\theta, \qquad \text{Length of an Arc} \tag{5.5}$$

so we can get the length of $s$ simply by multiplying the radius of its circle by the arc's angle in radians.²
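The conversions in (5.4) and (5.5) can be checked numerically. The following is a minimal sketch in Python; the function names `degrees_to_radians` and `arc_length` are illustrative choices of mine and do not come from the text.

```python
import math

def degrees_to_radians(deg: float) -> float:
    """Convert degrees to radians using 2*pi rad = 360 degrees."""
    return deg * (2 * math.pi) / 360.0

def arc_length(radius: float, theta_rad: float) -> float:
    """Arc length s = r * theta, per equation (5.5); theta must be in radians."""
    return radius * theta_rad

# One radian expressed in degrees, per equation (5.4).
one_radian_in_degrees = 360.0 / (2 * math.pi)

# A right angle (90 degrees) is one quarter of a revolution, pi/2 radians.
quarter_turn = degrees_to_radians(90.0)

print(round(one_radian_in_degrees, 1))   # 57.3
print(arc_length(2.0, quarter_turn))     # a quarter of the circumference of a circle of radius 2
```

With the angle in radians, the arc length needs no extra conversion factor, which is the simplification the paragraph above describes.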

5.2.3 Angular Velocity 

Suppose a turntable starts at angle $\theta_0$ and rotates to angle $\theta_f$ (figure 5.3). Then its angular displacement is $\theta = \theta_f - \theta_0$. Further, suppose the turntable performs this rotation in $t$ seconds. Then its angular velocity is the angular displacement $\theta$ divided by elapsed time $t$:

$$\omega \equiv \frac{\theta}{t}, \qquad \text{Angular Velocity} \tag{5.6}$$

which we measure in SI units of rad/s.





Angular velocity is the rate at which angular displacement changes.

Compare (5.6) to linear velocity, which is the rate at which linear displacement changes. Counterclockwise angular velocities are positive, whereas clockwise angular velocities are negative. In (5.6) the symbol $\equiv$ means "defined as." I use it to signify that I am defining $\omega$ to have a particular meaning, namely, $\theta/t$. Later, when I use $\omega$, it will carry this significance.

Here's another way to calculate angular displacement. Suppose the turntable shown in figure 5.7 is set so that the cone is at its rightmost position, aligned with the x-axis. Then we start the turntable and start a timer at time $t = 0$. The turntable rotates counterclockwise at a constant rate of $\omega$ rad/s, moving through angle $\theta$ in time $t$. Since the turntable rotates at a uniform speed, the size of the angle $\theta$ grows at a constant rate. Therefore, the angular displacement $\theta$ at time $t$ is the angular velocity times the elapsed time $t$:

$$\theta = \omega t. \qquad \text{Angular Displacement with Elapsed Time} \tag{5.7}$$

5.2.4 Angular Acceleration

If the turntable shown in figure 5.7 starts rotating with angular velocity $\omega_0$ and ends at time $t$ with angular velocity $\omega_f$, the change in angular velocity is $\omega = \omega_f - \omega_0$. If the change is not zero, the turntable exhibits angular acceleration $\alpha$, which is change in angular velocity $\omega$ through time $t$:

$$\alpha = \frac{\omega}{t} = \frac{\theta/t}{t} = \frac{\theta}{t^2}, \qquad \text{Angular Acceleration} \tag{5.8}$$

measured in SI units of (rad/s)/s = rad/s².

Angular acceleration is the rate at which angular velocity changes.

5.2.5 Rotational Speed 

If a bicycle's wheel is turning once per second at a constant rate, and the tire's radius $r = 0.5$ m, how fast is the bicycle going? If the circumference of the wheel is $c = 2\pi r = 3.14$ m, then the velocity of the bicycle must be about 3.14 m/s. Every point on the circumference of the tire is also traveling at 3.14 m/s. Thus, for some radius $r$ and some period of time $T$,³ the rotational speed of a point on a circle is

$$v = \frac{2\pi r}{T}. \qquad \text{Rotational Speed} \tag{5.9}$$
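The bicycle example can be reproduced directly from (5.9). This is a small Python sketch; the function name `rotational_speed` and its parameter names are my own labels, not the book's.

```python
import math

def rotational_speed(radius_m: float, period_s: float) -> float:
    """Speed of a point on a circle: v = 2*pi*r / T, per equation (5.9)."""
    return 2 * math.pi * radius_m / period_s

# Bicycle wheel from the text: radius 0.5 m, one revolution per second.
bicycle_speed = rotational_speed(radius_m=0.5, period_s=1.0)
print(f"{bicycle_speed:.2f} m/s")   # 3.14 m/s
```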


5.2.6 Centripetal Acceleration 

Speed doesn't imply direction, but velocity does. As a point on the circle travels, its direction changes moment by moment. So, even though the speed of a point on the circle remains uniform, its velocity changes from instant to instant because its direction changes.

Figure 5.4a shows a circle rotating through points $p_1$ and $p_2$. The velocity at these points can be drawn as vectors, $v_1$ and $v_2$, representing the linear velocity of each point. The difference of the two vectors is the change in velocity $\Delta v = v_2 - v_1$. The difference of two vectors can be shown by putting their bases together and measuring the distance between their tips (figure 5.4b). Similarly, the vector distance between $p_1$ and $p_2$ is $\Delta r = r_2 - r_1$ (figure 5.4c). Since the length of $v_1$ equals the length of $v_2$, and the length of $r_1$ equals the length of $r_2$, triangle $r_1 r_2 \Delta r$ and triangle $v_1 v_2 \Delta v$ are both isosceles triangles.

Let's simplify things a bit. Since $v_1 = v_2$, let's define $v = v_1 = v_2$, and since $r_1 = r_2$, let's define $r = r_1 = r_2$ (figures 5.4d and 5.4e). Note that the isosceles triangles in 5.4d and 5.4e have the same angle $\theta$. So they are similar. From geometry we know that for similar triangles,

$$\frac{\Delta v}{v} = \frac{\Delta r}{r}. \tag{5.10}$$


For the next step, we can make a simplifying assumption. First, let $\Delta t = t_2 - t_1$, the time it takes for $p_1$ to get to $p_2$. Now, for small angles $\theta$,

$$\Delta r \approx v \cdot \Delta t. \tag{5.11}$$

That is, $\Delta r$ is approximately equal to $v \cdot \Delta t$ for small angles $\theta$. Properly speaking, the length we should use is the arc of the circle between $p_1$ and $p_2$ because that's the distance the point will actually be traveling. But for small angles, the difference between the length of the arc from $p_1$ to $p_2$ and the length of the chord from $p_1$ to $p_2$ can be ignored. Being able to ignore this will greatly simplify what follows. If we substitute (5.11) into (5.10), we derive the acceleration of the point on the circle as follows:

$$\frac{\Delta v}{v} = \frac{v \cdot \Delta t}{r} \quad\Longrightarrow\quad \frac{\Delta v}{\Delta t} = \frac{v^2}{r}.$$






Figure 5.5 

Centripetal acceleration. 

The ratio $\Delta v/\Delta t$ is acceleration because it represents change in velocity over time. This is called centripetal acceleration because the direction of the bending force is always toward the center of the circle (see figure 5.5). It is defined as

$$a_c = \frac{v^2}{r}, \qquad \text{Centripetal Acceleration} \tag{5.12}$$

where $a_c$ is centripetal acceleration, $v$ is velocity, and $r$ is the radius of the circle.

Suppose we can control a rocket in deep space and want it to turn in a circle around a point with radius $r$. To get it to turn, we would have to ignite one rocket on its tail to propel it forward with a force proportional to $v$ and ignite another pointing sideways with a force proportional to $a_c$. Figure 5.5a shows that for $v = 50$ and $r = 125$, $a_c$ must be $50^2/125 = 20$. Figure 5.5b shows that if $v$ is doubled to 100 for the same $r$, then $a_c$ must quadruple to 80 in order for the rocket to maintain a circle of the same size.
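The two rocket cases can be verified with a few lines of Python. This is only a sketch of equation (5.12); the function name `centripetal_acceleration` is hypothetical, and the values are unitless as in the text.

```python
def centripetal_acceleration(v: float, r: float) -> float:
    """a_c = v**2 / r, per equation (5.12)."""
    return v ** 2 / r

# Values from figure 5.5: same radius, velocity doubled.
print(centripetal_acceleration(50, 125))    # 20.0
print(centripetal_acceleration(100, 125))   # 80.0
```

Doubling $v$ quadruples $a_c$ because the velocity enters the formula squared.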

5.2.7 Tangential Velocity 

On a merry-go-round the circular motion pushes riders away from the center, and pushes them harder, the further from the center they are. But is the direction of the push radial, directly away from the center? Setting an object on a turntable, we can spin it at some angular velocity $\omega$ sufficient to make it fly off. Suppose it flies off at point Q (figure 5.6). We would observe that the object's angular velocity is instantly converted into some linear velocity in a direction tangent to the point where it flew off.⁴ This is understandable because

Circular motion is linear velocity forward constrained by centripetal force toward a center.

If we suddenly eliminate the centripetal force, the remaining linear velocity is all that is left, and the object flies off in whatever direction it was last aimed. In figure 5.6 the velocity of the object at point Q is shown by a vector $v_T$ anchored on Q and drawn tangent to the circle. The vector $v_T$ indicates the tangential velocity of the object at point Q corresponding to its linear velocity.

Figure 5.6

Tangential speed.

Clearly, the object is subject to tangential velocity even when it is still on the turntable because this represents its linear velocity at each moment in time. Velocity implies both speed and direction, but the vector $v_T$ is constantly changing direction as it progresses around the circle. So the magnitude of the vector is just its length (without regard to which direction it points) and corresponds to its speed.

Intuitively, we can tell that the tangential speed $v_T$ must be related to the turntable's angular velocity $\omega = \theta/t$ as well as to its radius $r$ because an increase in either would tend to give more velocity to the object. But how can we express this?

Recall that (5.5) relates angular displacement $\theta$ and radius $r$ to the arc length $s$ by $s = r\theta$, and that (5.6) relates angular velocity to angular displacement and time by $\omega = \theta/t$. If we introduce (5.6) into (5.5), the result combines angular velocity and radius, as we require. Dividing both sides of (5.5) by time $t$, we obtain

$$\frac{s}{t} = \frac{r\theta}{t} = r \cdot \frac{\theta}{t}\ \text{rad/s}. \tag{5.13}$$

The right-hand side now has a term $\theta/t$ in it. Since angular velocity $\omega = \theta/t$, (5.13) can be rewritten as

$$\frac{s}{t} = r\omega\ \text{rad/s}.$$

Since $s$ measures arc length, the ratio $s/t$ expresses the speed of a point on the circle. Thus, tangential speed is defined as

$$v_T = \frac{s}{t} = r\omega\ \text{rad/s}. \qquad \text{Tangential Speed} \tag{5.14}$$





We must use units of rad/s because this equation was derived from (5.5), which defines radian measure. When an object is thrown off a turntable, its tangential speed is converted into tangential velocity because then it has a particular direction, namely, tangent to its last point of contact.
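As a quick consistency check, the tangential speed of (5.14) should agree with the rotational speed found for the bicycle wheel in section 5.2.5. The Python sketch below assumes one revolution per second on a circle of radius 0.5 m; the function name `tangential_speed` is my own.

```python
import math

def tangential_speed(radius: float, omega: float) -> float:
    """v_T = r * omega, per equation (5.14); omega must be in rad/s."""
    return radius * omega

# One full revolution (2*pi radians) every second.
omega = 2 * math.pi / 1.0

print(f"{tangential_speed(0.5, omega):.2f}")   # 3.14
```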

5.2.8 Period and Frequency 

As the cone on the turntable in figure 5.1 completes one revolution, the corresponding simple harmonic motion completes one back-and-forth cycle. The period $T$ of this cycle clearly depends upon the angular velocity $\omega$ of the circle. Since by (5.6), $\omega = \theta/t$, and the circle completes one revolution of $\theta = 2\pi$ radians in $t = T$ seconds, we can relate angular velocity to period $T$ as follows:

$$\omega = \frac{\theta}{t} = \frac{2\pi}{T},$$

and so

$$T = \frac{2\pi}{\omega}. \qquad \text{Period related to angular velocity} \tag{5.15}$$

Since frequency $f = 1/T$, we can relate the angular velocity to frequency:

$$\omega = 2\pi \cdot \frac{1}{T} = 2\pi f.$$

Relating angular velocity to frequency in this way will be so useful in subsequent chapters that it deserves being repeated:

$$\omega = 2\pi f. \qquad \text{Radian Velocity} \tag{5.16}$$

In this book, when I write $\omega$, I will almost always mean its definition $2\pi f$. Solving (5.16) for $f$, we derive the definition of frequency:

$$f = \frac{\omega}{2\pi}. \qquad \text{Frequency related to angular velocity} \tag{5.17}$$

This definition says that frequency is the ratio of the angular velocity, $\omega = \theta/t$ (see equation (5.6)), to the arc length of a circle. The greater the angular velocity, the more often it completes a full circle, hence the higher its frequency.
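Equations (5.15) through (5.17) are three views of the same relationship, which the following Python sketch makes concrete. The function names and the 440 Hz example value are my own assumptions, chosen only for illustration.

```python
import math

def angular_velocity(frequency_hz: float) -> float:
    """omega = 2*pi*f, per equation (5.16); result in rad/s."""
    return 2 * math.pi * frequency_hz

def period(omega: float) -> float:
    """T = 2*pi / omega, per equation (5.15); result in seconds."""
    return 2 * math.pi / omega

def frequency(omega: float) -> float:
    """f = omega / (2*pi), per equation (5.17); result in hertz."""
    return omega / (2 * math.pi)

omega_example = angular_velocity(440.0)   # 440 Hz is an arbitrary example
print(omega_example)                      # radians per second
print(period(omega_example))              # seconds per cycle
print(frequency(omega_example))           # recovers the original frequency
```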

5.3 Projection of Circular Motion

Figure 5.7 shows a spring/mass system vibrating vertically next to a turntable that has a cone mounted on its edge. By appropriate choices of rotational speed of the turntable, elasticity of the spring, and weight of the mass, the motion of the shadows of the cone and mass can be synchronized on a screen behind them. This suggests that the simple harmonic motion of a pendulum or a weighted spring can be related to uniform circular motion via projection.






Figure 5.7 

Simple harmonic motion as the projection of uniform circular motion.







Figure 5.8

Front view of turntable. (Labels in figure: Q, projected light, x-axis, position at t = 0, harmonic, circular.)


Figure 5.8 shows the turntable and screen from figure 5.7 with the cone at point Q. Since light shines across the circle parallel to the x-axis, point Q′, which is the shadow of Q, appears on the screen at the same displacement above the x-axis.

The displacement of points Q and Q′ from the x-axis is $y$, the projection of the radius $A$ onto the y-axis. Elementary trigonometry shows that the radius $A$, its angle $\theta$, and the value of $y$ are connected by the sine relation (see appendix A).

$$y = A \sin\theta = A \cdot \frac{y}{A}. \qquad \text{Sine Relation} \tag{5.18}$$

Equation (5.18) relates the height $y$ of the triangle, and hence the height of its projection on the screen, to the radius $A$ and its angle $\theta$. The sine relation allows us to reconcile circular motion with simple harmonic motion. In order to see how the vertical displacement $y$ changes, figure 5.9 adds a strip of film to record the position of the mass and cone through time, allowing us to see the sinusoidal motion of the spring/mass system together with the motion of the turntable. Mathematically and intuitively, it should be clear now that

Simple harmonic motion and the projection of uniform circular motion are the same.

Figure 5.9

Simple harmonic, uniform circular, and sinusoidal motion.

5.3.1 Relating Displacement of Simple Harmonic Motion to Time

Since, by (5.6), $\theta = \omega t$, we can relate the vertical displacement $y$ of the cone's shadow at time $t$ as follows:

$$y = A \sin\theta = A \sin\omega t, \tag{5.19}$$

where $A$ is the radius of the turntable, $\theta$ is the turntable's angular displacement, $t$ is time, and $\omega$ is angular velocity.

The expression $\omega t$ in (5.19) determines the rotational position of the turntable at time $t$; taking the sine of that rotational position determines the height of the vertical displacement $y$; multiplying the vertical displacement by $A$ scales the displacement for the size of the turntable.

Equation (5.19) shows the identity of simple harmonic motion and circular motion and provides a way to determine the displacement of a sinusoidal wave at any time $t$. We see that

The projection of simple harmonic motion through time generates sinusoidal motion.

The term $A$ in (5.19) can be interpreted either as the radius of a circle or as the amplitude of the corresponding simple harmonic motion because this value determines both attributes.
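The identity expressed by (5.19) can be illustrated numerically: the height of a point in uniform circular motion matches $A \sin\omega t$ at every instant. The Python sketch below uses hypothetical function names and an arbitrary rate of one revolution per second; it assumes the point starts on the positive x-axis, as in figure 5.8.

```python
import math

def shm_displacement(amplitude: float, omega: float, t: float) -> float:
    """y = A * sin(omega * t), per equation (5.19)."""
    return amplitude * math.sin(omega * t)

def cone_position(radius: float, omega: float, t: float) -> tuple[float, float]:
    """(x, y) of a point in uniform circular motion that starts on the positive x-axis."""
    theta = omega * t                     # angular displacement, equation (5.7)
    return radius * math.cos(theta), radius * math.sin(theta)

A, omega = 1.0, 2 * math.pi               # unit radius, one revolution per second
for t in (0.0, 0.125, 0.25, 0.5, 0.75):
    _, y_shadow = cone_position(A, omega, t)
    # The shadow on the screen keeps only the vertical coordinate of the cone.
    assert math.isclose(y_shadow, shm_displacement(A, omega, t), abs_tol=1e-12)
    print(f"t = {t:5.3f} s   y = {y_shadow:+.3f}")
```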





















Figure 5.10 

Constructing a sine wave. 

5.4 Constructing a Sinusoid 

A simple way to generate a sine wave is to plot a few selected points of (5.18) and connect the points with a smooth line. Figure 5.10 shows eight values of $A \sin\theta$ every 45° as $\theta$ makes one complete revolution. The y-axis shows the corresponding values of $A \cdot (y/A)$ for radius $A = 1$. A circle of radius 1 is a unit circle. It is convenient to set $A$ to 1 in order to keep the example simple, but it can be any value. Notice that $y$ takes on values in the range −1 to 1 as $\theta$ varies.

■ When the angle $\theta$ is 0° or 180°, the displacement $y = 0$ and sin 0° = sin 180° = y/A = 0/1 = 0.

■ When $\theta$ is 90°, $y = 1$ and sin 90° = y/A = 1/1 = 1.

■ When $\theta$ is 270°, $y = -1$ and sin 270° = y/A = −1/1 = −1.

These cardinal points are marked with diamonds in figure 5.10. 

■ At 45°, triangle Axy in figure 5.8 becomes an isosceles right triangle, and by elementary geometry,

$$y = \frac{A}{\sqrt{2}} = \frac{1}{1.414\ldots}.$$

Plugging this value into the sine relation yields the formula

$$\sin\frac{\pi}{4} = \sin 45^\circ = \frac{y}{A} = \frac{1}{\sqrt{2}} = 0.707\ldots$$








Figure 5.11 

Anatomy of a sine wave.

■ Similar reasoning establishes the values at 135°, 225°, and 315° (figure 5.10).

So as $\theta$ goes from 0° to 360°, $\sin\theta$ exhibits one period of sinusoidal motion.
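The eight points described above can be tabulated with a short Python sketch. It assumes a unit circle ($A = 1$) and samples every 45°; the variable name `points` is my own.

```python
import math

A = 1.0  # radius of the unit circle

# Eight samples of A * sin(theta), one every 45 degrees, for a single revolution.
points = [(deg, A * math.sin(math.radians(deg))) for deg in range(0, 360, 45)]

for deg, y in points:
    print(f"{deg:3d} degrees   y = {y:+.3f}")
```

Connecting these samples with a smooth line gives one period of the sine wave; floating-point arithmetic makes the zero crossings come out as very small numbers rather than exactly zero.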

5.4.1 Anatomy of a Sinusoid

The landmarks of the sinusoidal wave are shown in figure 5.11. The y-axis shows the amplitude, which is proportional to a corresponding circular radius $A$. The x-axis shows the phases of the sine wave's various notable features, such as where it crosses the x-axis (zero crossings), and its crests and troughs. We speak of the "phases of the moon" in the same sense: phase describes the characteristic points reached periodically each time a wave repeats. The period, or cycle, of a sine wave is one complete movement through all its phases, corresponding to one complete revolution of a corresponding circle.

It will often be more convenient to show the passage of time on the x-axis rather than the size of the angle $\theta$. Solving (5.7) for $t$ yields

$$t = \frac{\theta}{\omega}, \tag{5.20}$$

which shows that time is directly proportional to angular displacement $\theta$. This means the x-axis can either measure elapsed time or elapsed phase.

Since frequency is the reciprocal of time, $f = 1/t$, (5.20) can be rearranged:

$$f = \frac{\omega}{\theta}, \tag{5.21}$$

which shows that frequency is directly proportional to angular velocity $\omega$. The greater the angular velocity, the more rapidly the turntable turns.

If the x-axis shows elapsed time, we are measuring frequency; if the x-axis shows elapsed phase, we are measuring periodicity.

Sinusoids, like circles, have no beginning and no end, so the period of a sine wave can start anywhere. Conventionally, sine wave periods are usually regarded as beginning at a positive-going zero crossing (figure 5.11) and extending until just before the next positive-going zero crossing. But we could just as well measure the period from crest to crest, or from trough to trough, by suitable choice of phase offset.

5.4.2 Phase Offset 

Equation (5.19) requires that the turntable start at its 0° position, which is when point Q in figure 5.8 is aligned with its positive x-axis. In this position, the vertical displacement $y = 0$ because $\sin 0 = 0$. If we wish to be able to start the turntable at any orientation, we must introduce a way to specify its starting position in (5.19). If we don't start with $\theta = 0$, $y$ will have a nonzero initial value.

Let's define a constant $\phi$, which is the phase angle (or phase offset or phase shift) of the turntable's starting position. The vertical displacement of the cone's shadow at time $t$ with phase offset $\phi$ can then be written as

$$y = A \sin(\omega t + \phi), \tag{5.22}$$

where $\phi$ defines a constant offset from 0°. It can take on any positive or negative real value. For instance, suppose we set $\phi = \pi/2$. Note in figure 5.10 that $\sin(\pi/2) = 1$. Then at time $t = 0$,

$$y = A \sin(\omega t + \pi/2) = A \sin(\pi/2) = A.$$

This means that at $t = 0$ the turntable starts rotating with the cone positioned at the top of the turntable, which is rotated 90° counterclockwise from the previous starting position.
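The effect of the phase offset at $t = 0$ can be seen in a brief Python sketch of (5.22). The function name `shm_with_phase` and the example amplitude are assumptions made for illustration.

```python
import math

def shm_with_phase(amplitude: float, omega: float, t: float, phi: float) -> float:
    """y = A * sin(omega * t + phi), per equation (5.22)."""
    return amplitude * math.sin(omega * t + phi)

A, omega = 2.0, 2 * math.pi     # arbitrary example values

# With no phase offset the cone starts on the x-axis, so y = 0.
print(shm_with_phase(A, omega, 0.0, 0.0))           # 0.0

# With phi = pi/2 the cone starts at the top of the turntable, so y = A.
print(shm_with_phase(A, omega, 0.0, math.pi / 2))   # 2.0
```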

5.4.3 Wavelength 

The physical length of a waveform period, its wavelength, depends upon the medium through which the wave is traveling and its frequency. In air, sound waves travel at about 340 m/s (approximately 1100 feet per second) at a temperature of 20°C (see section 7.4).

So a frequency of 1 kHz in air has a wavelength of approximately

$$\frac{1\ \text{second}}{1000\ \text{periods}} \cdot \frac{340\ \text{meters}}{1\ \text{second}} = 0.34\ \text{meters per period},$$

or

$$\frac{1\ \text{second}}{1000\ \text{periods}} \cdot \frac{1100\ \text{feet}}{1\ \text{second}} = 1.1\ \text{feet per period}.$$
Note how these three measurements are interrelated.

              is a measure of ...       is measured in ...   unit
Periodicity   duration                  seconds/cycle        seconds
Frequency     how rapid or how often    cycles/second        hertz
Wavelength    length                    meters/period        meter
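The wavelength arithmetic above amounts to dividing the speed of propagation by the frequency. Here is a minimal Python sketch using the speeds quoted in the text; the function name `wavelength` is my own.

```python
def wavelength(speed: float, frequency_hz: float) -> float:
    """Length of one period: propagation speed divided by frequency."""
    return speed / frequency_hz

# A 1 kHz tone in air at about 20 degrees C, using the speeds given in the text.
print(wavelength(340.0, 1000.0))    # 0.34 (meters per period)
print(wavelength(1100.0, 1000.0))   # 1.1 (feet per period)
```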





5.4.4 Velocity of Simple Harmonic Motion

How can we characterize the velocity of an object moving in simple harmonic motion when both the direction and speed of such an object change through time as the object vibrates back and forth? Since harmonic motion is the projection of circular motion, we should be able to understand the velocity of harmonic motion by thinking more about tangential velocity.

Figure 5.12 shows the projection of tangential velocity $v_T$ of an object on the edge of a turntable. By a combination of geometry and trigonometry (see appendix A), we see that the velocity $v$ of the shadow that is projected on the screen is just the y-axis component of the vector $v_T$, that is,

$$v = v_T \cos\theta, \tag{5.23}$$

where $\theta = \omega t$.

Recall from (5.14) that the tangential velocity $v_T$ is related to the angular velocity $\omega$ by $v_T = r\omega$. Let's substitute amplitude $A$ for radius $r$, so now $v_T = A\omega$. Substituting $A\omega$ for $v_T$ in (5.23), we obtain the velocity of simple harmonic motion:

$$v = A\omega\cos\theta = A\omega\cos\omega t. \tag{5.24}$$

This tells us that even though an object on a rotating circle moves with uniform circular motion, the velocity of its corresponding simple harmonic motion is not uniform. The velocity constantly varies between maximum and minimum values through time sinusoidally. When $\theta$ equals exactly 90° or 270°, velocity is exactly 0, and the object in simple harmonic motion is momentarily stationary. Velocity is positive maximum when $\theta$ equals 0, and at that point it equals

$$v = A\omega. \qquad \text{Maximum Velocity of Simple Harmonic Motion} \tag{5.25}$$

Velocity is negative maximum when $\theta$ equals 180°.
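One way to gain confidence in (5.24) is to compare it with the slope of the displacement curve (5.19), estimated over a very short time step. The Python sketch below does this with a central difference; the function names, the sample time, and the step size are my own assumptions rather than anything specified in the text.

```python
import math

def shm_displacement(amplitude: float, omega: float, t: float) -> float:
    """y = A * sin(omega * t), per equation (5.19)."""
    return amplitude * math.sin(omega * t)

def shm_velocity(amplitude: float, omega: float, t: float) -> float:
    """v = A * omega * cos(omega * t), per equation (5.24)."""
    return amplitude * omega * math.cos(omega * t)

A, omega = 1.0, 2 * math.pi      # unit amplitude, one cycle per second
t, dt = 0.1, 1e-6                # sample time and a small step for the estimate

# Rate of change of displacement, estimated from two nearby samples.
numeric = (shm_displacement(A, omega, t + dt) - shm_displacement(A, omega, t - dt)) / (2 * dt)

print(numeric, shm_velocity(A, omega, t))   # the two values agree closely
print(shm_velocity(A, omega, 0.0))          # maximum velocity A * omega, equation (5.25)
```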






Figure 5.12 

Projection of tangential velocity. 





5.5 Energy of Waveforms

Equation (5.25) says that the velocity of an object vibrating in simple harmonic motion is proportional to both the amplitude and the angular velocity of the corresponding unit circle. In other words, simple harmonic motion (the projection of circular motion) will have higher velocity either if the corresponding circular motion has a longer radius or if that radius turns faster. This suggests that a mass moving in simple harmonic motion would have greater momentum if either its amplitude or its frequency were increased.

In section 4.14, kinetic energy $E_k$ was shown as the product of the mass $m$ of an object times the square of its velocity $v$, or $E_k = mv^2$. I used an automotive metaphor to show that doubling a car's speed quadruples its energy. Now let's apply this understanding to a molecule of air zipping in and out of someone's ear as part of a sound wave impinging on their eardrum.

If the amplitude of a wave doubles while the frequency remains the same, the particle must cover twice the distance in the same amount of time (via one period of doubled amplitude). Or, if the frequency of the wave doubles, the particle must cover twice the distance in the same amount of time (via two periods at the original amplitude). In either case, the energy of the molecule of air has quadrupled because the velocity of its simple harmonic motion has doubled.

If the wave in figure 5.13a is stretched out, it has the length shown in figure 5.13d. The wave in 5.13b, with twice the amplitude of the wave in 5.13a, has the length shown in 5.13e. Wave 5.13c has the same amplitude and twice the frequency of wave 5.13a, and its length also equals that shown in 5.13e. Since the duration $T$ of all three waves (5.13a, 5.13b, and 5.13c) is the same, but the length of waves 5.13b and 5.13c is twice that of wave 5.13a, clearly waves 5.13b and 5.13c have twice the speed of wave 5.13a. So we see that a wave's energy depends on both its amplitude and its frequency.

Consider a poi nt on theturntable i n figure5.12.1 f the turntabl e's radius i s>A, i t has ci rcumference 
d = 2izA. Since, by equation (4.8), velocity is v = dlt, the cireular velocity of the point is v = 2 nA/t, 



d) Length of (a) e) Length of (b) and also the length of (c) 


Figure 5.13 

Path lengths. 



C hapter 5 


144 


which also can be written as v = 2nA ■ (1 lt). Since, by equation (4.1), thefrequency of rotation 
isf=l/t, wecan alsowrite 

$$v = 2\pi A f. \qquad \text{Rotational Velocity} \quad (5.26)$$

Taking $E = mv^2$ from equation (4.28) and substituting $v$ from (5.26) yields

$$E = m(2\pi A f)^2, \qquad \text{Rotational Energy} \quad (5.27)$$

which confirms that wave energy depends upon both frequency and amplitude.
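
As a quick numerical check of equations (5.26) and (5.27), the following Python sketch computes the
rotational energy for a baseline case and then doubles the amplitude and the frequency in turn. The
function names and the sample values for mass, amplitude, and frequency are illustrative assumptions,
not values taken from the text.

```python
import math

def rotational_velocity(amplitude, frequency):
    """Equation (5.26): v = 2*pi*A*f."""
    return 2.0 * math.pi * amplitude * frequency

def rotational_energy(mass, amplitude, frequency):
    """Equation (5.27): E = m * (2*pi*A*f)**2."""
    return mass * rotational_velocity(amplitude, frequency) ** 2

# Hypothetical values chosen only for illustration.
m, A, f = 1.0, 0.5, 100.0
base = rotational_energy(m, A, f)

# Ratio of energies when amplitude or frequency is doubled.
ratio_double_amplitude = rotational_energy(m, 2 * A, f) / base
ratio_double_frequency = rotational_energy(m, A, 2 * f) / base
print(ratio_double_amplitude, ratio_double_frequency)
```

Both ratios should come out as 4 (up to floating-point rounding), matching the claim that doubling
either amplitude or frequency quadruples the energy.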

5.5.1 Measuring the Energy of Waveforms

Peak Pressure Level. Perhaps the easiest way to measure the strength of a waveform is to examine
how its maxima and minima (its highest and lowest points) relate to the ambient pressure level.
Peak pressure level of a sound wave is the difference between the ambient pressure level and the
magnitude of either the maximum or minimum pressure level of the sound wave, whichever is greater:

$$l_p = \max(|l_+|, |l_-|) - l_a, \qquad \text{Peak Pressure Level} \quad (5.28)$$

where $l_p$ is peak pressure level, $l_a$ is ambient pressure level, $l_+$ is the highest peak, and
$l_-$ is the deepest trough. The operator $|\ldots|$ gives the magnitude of the enclosed expression,
and the function $\max(a, b, \ldots)$ chooses the greatest value of its arguments. Figure 5.14 shows
the peak pressure level.

Peak-to-Peak Pressure Level. Every sound recording device has some limit beyond which it
can no longer accurately represent the strength of the waveform being recorded, and waveforms
with peaks greater than the limit are distorted (see section 4.24.2). Modern recorders often contain
volume level meters that measure the strength of the recorded waveform based on (5.28) to help
the recordist prevent distortion. The peak-to-peak pressure level of a waveform is the magnitude
of the distance between $l_+$ and $l_-$:

$$l_{pp} = |l_+ - l_-|. \qquad \text{Peak-to-Peak Pressure Level} \quad (5.29)$$
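
The two measures in (5.28) and (5.29) can be sketched for a sampled waveform as follows. This is a
minimal illustration that assumes the samples are signed pressure deviations and that the ambient
level is 0; the function names and the sample list are hypothetical.

```python
def peak_level(samples, ambient=0.0):
    """Equation (5.28): max(|l+|, |l-|) - l_a for a list of pressure samples."""
    highest = max(samples)   # l+, the highest peak
    lowest = min(samples)    # l-, the deepest trough
    return max(abs(highest), abs(lowest)) - ambient

def peak_to_peak_level(samples):
    """Equation (5.29): |l+ - l-|."""
    return abs(max(samples) - min(samples))

# Hypothetical samples, measured relative to an ambient level of 0.
samples = [0.0, 0.4, 0.9, 0.3, -0.2, -0.6, -0.1]
peak = peak_level(samples)
peak_to_peak = peak_to_peak_level(samples)
print(peak, peak_to_peak)
```

For this list the highest peak is 0.9 and the deepest trough is -0.6, so the peak level is governed by
the positive excursion while the peak-to-peak level spans both.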

Why Average Pressure Level Doesn't Work. Peak-to-peak level shows the limits of a waveform's
amplitude, but it does not always provide the best information about a waveform's strength.
For example, a recording that is mostly silence except for a brief tone burst may have a large peak
amplitude if the tone burst is loud, but there is little energy in the waveform over its total duration
because it is mostly silent.

Figure 5.14

Peak pressure level.

One might try to get a clear picture of a waveform's strength by averaging the waveform's pressure
over time, hoping to smooth out the peaks. But sound waveforms are usually evenly balanced
above and below ambient pressure, so in general the peak and the trough have equal magnitude,
$|l_+| = |l_-|$. Therefore, the mean value of most sounds is typically close to zero, and so average
pressure is not a useful way to measure the strength of a waveform.
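
The tone-burst example can be sketched numerically. The snippet below builds a mostly silent signal
with a brief sine burst and compares its peak magnitude with its mean. The sample rate, burst
frequency, durations, and function name are all assumptions made for illustration.

```python
import math

SAMPLE_RATE = 8000    # samples per second (assumed)
BURST_FREQ = 440.0    # burst frequency in Hz (assumed)

def tone_burst(total_seconds=1.0, burst_seconds=0.05, amplitude=1.0):
    """Return a signal that is a short sine burst followed by silence."""
    n_total = round(total_seconds * SAMPLE_RATE)
    n_burst = round(burst_seconds * SAMPLE_RATE)
    signal = []
    for n in range(n_total):
        if n < n_burst:
            phase = 2.0 * math.pi * BURST_FREQ * n / SAMPLE_RATE
            signal.append(amplitude * math.sin(phase))
        else:
            signal.append(0.0)   # silence after the burst
    return signal

signal = tone_burst()
peak = max(abs(x) for x in signal)     # large: the burst is loud
mean = sum(signal) / len(signal)       # near zero: the burst is balanced
print(peak, mean)
```

The peak is close to the burst amplitude, while the mean is essentially zero, so neither number by
itself says how much energy the signal carries over its full duration.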

RMS Level. Ideally, it would be useful to observe the power contained in the waveform because
power is the energy in the waveform through time. But all we can easily measure with a microphone
is the waveform's pressure fluctuations. How can we derive a measure of energy from pressure?
The key lies in recalling that there is a square relation between amplitude and energy.

The average value of $\cos t$ over one full period is 0.0. The peak amplitude $l_p = |l_+| = |l_-|$ of the
cosine is 1.0. The peak-to-peak amplitude is $l_{pp} = 2.0$ (figure 5.15).

Let $s(t) = \cos t$. There is a trigonometric identity (see volume 2, appendix A.4.1) that says

$$\cos a \cos b = \frac{\cos(a - b) + \cos(a + b)}{2}.$$

If we square $s(t)$, then

$$\begin{aligned}
s^2(t) &= \cos^2 t \\
       &= \cos t \cos t \\
       &= \frac{\cos(t - t) + \cos(t + t)}{2} \\
       &= \frac{1 + \cos 2t}{2}.
\end{aligned} \qquad (5.30)$$

So, by (5.30), $s^2(t)$ is a cosine wave at twice the frequency, offset by 1, and then divided by 2
(figure 5.16). This is what the original cosine waveform, shown in figure 5.15, looks like when
squared. Note that all values are now positive. The peak value is still 1.0. Its mean value is 0.5.

Figure 5.15

RMS level.

Figure 5.16

RMS cosine.

Now let's take the mean value for this squared waveform (0.5) and undo the effects of the squaring
operation. The square root of the mean value is $0.5^{1/2} = 0.707$. This is the root mean squared
(RMS) value of the waveform. So the RMS amplitude of $s(t)$ is 0.707. This allows us to say something
useful about the average energy of a sinusoid knowing only its amplitude. The relation of the
amplitudes is as follows:


Average         0
RMS             0.71
Peak            1
Peak-to-peak    2


In general, if $s(t)$ is a sinusoid with peak amplitude $A$, then its RMS amplitude is $A/\sqrt{2}$.
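
The square, mean, and square-root steps can be checked numerically on one full period of a sampled
cosine. This is only a sketch: the function name `rms`, the sample count, and the peak amplitude are
assumptions, and the check is applied only to the sinusoidal case derived above.

```python
import math

def rms(samples):
    """Square each sample, take the mean, then take the square root."""
    mean_square = sum(x * x for x in samples) / len(samples)
    return math.sqrt(mean_square)

N = 1000    # samples in one full period (assumed)
A = 3.0     # hypothetical peak amplitude

one_period = [A * math.cos(2.0 * math.pi * n / N) for n in range(N)]
measured = rms(one_period)      # RMS computed from the samples
predicted = A / math.sqrt(2)    # RMS predicted by A / sqrt(2)
print(measured, predicted)
```

The measured and predicted values should agree to within floating-point rounding.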

Because we used a sinusoid to derive RMS amplitude, this measure is only valid for sinusoids.
In particular, it is not valid for time-varying waveforms. This, of course, leaves out all the
interesting real-world audio waveforms we'd like to measure with it. Nonetheless, this definition of
RMS is widely used in practice⁵ because, I suppose, it's better than nothing. But there are more
sophisticated techniques to overcome this difficulty and find the true RMS value of arbitrary
waveforms (see volume 2, chapter 1).

Sound Pressure Level. Although the decibel scale was developed for sound intensity, we can adapt
it to measure sound pressure level. Equation (4.40) defined decibels of sound intensity level (dB SIL) as

$$y_{\text{dB SIL}} = 10 \log_{10} \frac{I'}{I}, \qquad \text{dB SIL} \quad (5.31)$$

where $I$ is a reference intensity, and $I'$ is the intensity being measured. Recalling that intensity is
proportional to the square of amplitude, we can define decibels of sound pressure level (dB SPL) as

$$y_{\text{dB SPL}} = 10 \log_{10} \left(\frac{A'}{A}\right)^2 = 2 \cdot 10 \cdot \log_{10} \frac{A'}{A}$$

and

$$y_{\text{dB SPL}} = 20 \log_{10} \frac{A'}{A}, \qquad \text{dB SPL} \quad (5.32)$$

where $A$ is a reference amplitude, and $A'$ is the amplitude being measured.





For the same ratio, decibels of sound pressure level (SPL) correspond to twice the equivalent decibels
of sound intensity level (SIL). Where a doubling of intensity corresponds to an increase of 3 dB SIL,
a doubling of pressure corresponds to an increase of 6 dB SPL. A ratio of 10:1 equals 10 dB SIL when
applied to intensity and 20 dB SPL when applied to pressure.
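
Equations (5.31) and (5.32) translate directly into code. The sketch below uses hypothetical function
names and an arbitrary reference value of 1.0 to show the 3 dB and 6 dB doublings and the tenfold
case.

```python
import math

def db_sil(intensity, reference_intensity):
    """Equation (5.31): 10 * log10(I' / I)."""
    return 10.0 * math.log10(intensity / reference_intensity)

def db_spl(amplitude, reference_amplitude):
    """Equation (5.32): 20 * log10(A' / A)."""
    return 20.0 * math.log10(amplitude / reference_amplitude)

# Reference values of 1.0 are arbitrary; only the ratios matter.
doubling_intensity = db_sil(2.0, 1.0)    # roughly 3 dB
doubling_pressure = db_spl(2.0, 1.0)     # roughly 6 dB
tenfold_intensity = db_sil(10.0, 1.0)    # 10 dB
tenfold_pressure = db_spl(10.0, 1.0)     # 20 dB
print(doubling_intensity, doubling_pressure, tenfold_intensity, tenfold_pressure)
```

Because the factor of 20 in (5.32) is twice the factor of 10 in (5.31), any given ratio yields twice
as many decibels when it is treated as a pressure ratio.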

5.6 Summary 

Uniform circular motion is circular movement at a constant speed. Simple harmonic motion is the
projection of circular motion. Angular displacement is the angle through which a rigid body rotates
about its axis of rotation. Counterclockwise angular displacement is taken to be positive and
clockwise angular displacement to be negative. The angle formed by a radius and an arc the length of
the radius is called a radian. Measuring angles with radians simplifies many calculations. Angular
velocity is the rate at which angular displacement changes. Angular acceleration is the rate at which
angular velocity changes.

By Newton's laws, objects tend to travel in a straight line. To travel in a circular path, an object
must experience centripetal acceleration to overcome the object's tendency to travel linearly.
Circular motion is linear velocity forward constrained by centripetal force toward a center. There
is no such thing as centrifugal force.

Simple harmonic motion of a pendulum or a weighted spring can be related to uniform circular
motion via projection. Simple harmonic motion is the same as the projection of uniform circular
motion. The projection of simple harmonic motion through time generates sinusoidal motion.

An object on a rotating circle moves with uniform circular motion, but the velocity of its
corresponding simple harmonic motion constantly varies between maximum and minimum values
through time sinusoidally. The speed of an object vibrating in simple harmonic motion is
proportional to both the amplitude and the angular velocity of the corresponding unit circle.

Peak pressure level of a sound wave is the difference between the ambient pressure level and the
magnitude of either the maximum or minimum pressure level of the sound wave, whichever is
greater. The peak-to-peak pressure level of a waveform is the magnitude of the distance between
its lowest and highest point. The root mean squared (RMS) value of a waveform is a useful measure
of energy in a sinusoid, calculated by squaring the waveform to derive its mean value and then
taking the square root of the mean value to determine the RMS value. Technically, this operation
is valid only for sinusoids.

Since it's easier to measure pressure variations in air than sound intensity, we adapt the decibel
of sound intensity level (dB SIL) to the decibel of sound pressure (dB SPL) by doubling the
dB SIL value.




Psychophysical Basis of Sound 


Pongileoni's bowing and the scraping of the anonymous fiddlers had shaken the air in the great hall, had set
the glass of the windows looking on to it vibrating: and this in turn had shaken the air in Lord Edwards'
apartment on the further side. The shaking air rattled Lord Edwards' membrana tympani; the interlocked
malleus, incus, and stirrup bones were set in motion so as to agitate the membrane of the oval window and
raise an infinitesimal storm in the fluid of the labyrinth. The hairy endings of the auditory nerve shuddered
like weeds in a rough sea; a vast number of obscure miracles were performed in the brain, and Lord Edwards
ecstatically whispered 'Bach!'

— Aldous Huxley, Point Counter Point

The length of strings is not the direct and immediate reason behind the forms [ratios] of musical intervals,
nor is their tension, nor their thickness, but rather, the ratios of the numbers of vibrations and impacts of air
waves that go to strike our eardrum.

— Galileo Galilei, "Two New Sciences" 

We must distinguish carefully the ratios that our ears really perceive from those that the sounds expressed as
numbers include.

— Leonhard Euler, "Conjecture sur la raison de quelques dissonances généralement reçues dans la musique"

What has been said of sonorous bodies should be applied equally to the fibres which carpet the bottom of the
ear; these fibres are so many sonorous bodies, to which the air transmits its vibration, and from which the
perception of sounds and harmony is carried to the soul.

—Jean-Philippe Rameau, "Generation Harmonique" 


6.1 Signaling Systems 

Suppose you and I were about to play a duet. In order to start, I might signal you by saying, "Ready?
One, two, three..."

For there to be a signal, there must be a source, a receiver, time, distance, and a medium (in
this case, air), which spans the distance and connects the source to the receiver. Altogether, this
constitutes a signaling system. A signal is a physically detectable quantity such as the pressure of
an acoustical wave that traverses a signaling system. More generally, a signal is a description of
how any one parameter varies with any other parameter. A system is any function that produces
an output signal based on an input signal.





Acoustics is the study of signals and signaling systems where the medium is air. The full
treatment of acoustics covers these elements:

■ Source: how sounds are created, including the mechanics of vibrating systems of all kinds. If
only musical sources are considered, the subject is musical instrument acoustics.

■ Medium: how sound behaves in air, including how sound is transmitted through air (spreading)
and what happens to it along the way if it encounters obstacles such as walls (absorption and
scattering) that cause transmission losses. Scattering happens, for example, when a sound strikes a
wall: some is reflected, the rest is transmitted through the wall. Room acoustics is the study of
sound transmission in rooms. Interference from other sources produces an ambient noise level that
may block or degrade the reception of a signal by a receiver.

■ Receiver: how we hear. If this entire subject is acoustics, and what goes on between our ears
is psychology, then the subject of how we hear is psychoacoustics. The question of how objective
measures of sensory stimuli relate to the subjective experience is the concern of psychophysics.

The properties of the receiving end of the human sonic signaling system are covered in this chapter,
and the properties of sources and media are covered in chapter 7.

6.2 The Ear

Consider the problems that our hearing helps us to solve. The ears detect, analyze, and classify
biologically interesting sounds: they compile spectral and temporal information of incoming signals,
parse them into various sources, localize these sources in time and space, and construct a model
of the auditory scene that surrounds us.¹ In a lecture I once heard, Albert Bregman characterized
our hearing faculty as follows: Suppose we scraped out two minor indentations (representing our
ear canals) on the edge of a vast lake (representing our sonic environment) and installed two floats
in them (corresponding to our ear drums). Suppose that by simply observing how the waves moved
the floats up and down, we were somehow able to understand everything that was happening in
the lake: to correctly identify boats going by, to note their position, to distinguish boats from fish,
wind, reflections, and so forth. Our hearing effectively carries out all these functions and others
with little more than this kind of arrangement.

The auditory system is attuned not only to listen to certain sounds but to ignore sounds that are
not biologically relevant. Such sounds include ambient noises and the effects of sound reflection,
refraction, and diffusion, which together can produce ugly distortions of a sound source. If we
notice these secondary signals at all, it is to use them constructively to characterize our acoustic
environment. No information is wasted by the auditory system. If we add to this the fact that our
audition is also capable of carrying us away in transports of rapture when we hear music that moves
us, this is an extraordinary faculty indeed.






Figure 6.1 

Schematic diagram of the human ear. 

Figure 6.1 is a simplified drawing of a cross-section of one human ear showing the outer ear,
middle ear, and inner ear.

6.2.1 Outer Ear 

The outer ear consists of the pinna (the part that sticks out from your head), the auditory canal 
(meatus), and the eardrum (tympanum). 

The funnel-shaped pinna helps collect sound from the environment. Its shape modifies the arriving
frequency information depending on the direction of the sound source, imprinting directional clues
that we use to identify the location of the source. Even the shape of the head and torso and the distance
between the ears influence how we identify direction. Frequencies around 3000 Hz are transferred
most efficiently by the meatus, and this is the frequency range of our greatest hearing sensitivity.

The tympanum is bent by the force of arriving sound and transmits the motion to the middle ear.
Although it vibrates most easily between 1 and 3.5 kHz, it transmits sound to the inner ear over
the entire audible frequency range.² It has a conical shape, a highly detailed and fibrous structure,
and an angular placement in the ear canal. The function of the eardrum and the middle ear is to
provide mechanical advantage to resolve the mismatch between the density of air in the outer ear and
the fluid of the inner ear. Without this impedance matching, very little acoustical energy would be
absorbed by the inner ear and hearing would be severely limited. It is still largely a mystery how
the tympanum accomplishes this task over such a wide frequency range.

6.2.2 Middle Ear 

The middle ear is the chamber immediately behind the tympanum. It is connected to the throat by
the Eustachian tube, which allows air pressure behind the tympanum to normalize to external air
pressure. A mechanical linkage system couples vibrations arriving at the tympanum to the inner
ear, consisting of three tiny bones called ossicles, known as the hammer (malleus), the anvil
(incus), and the stirrup (stapes), named for their shapes. The hammer is attached to the
tympanum, and the stapes is connected to the oval window leading to the inner ear.

Since the outer ear is in air but the inner ear is in fluid, the density difference between them would
allow very little energy from the air to penetrate into the inner ear were it not for the leverage
provided by the ossicles. For instance, go to a swimming pool and have a friend talk to you as you put
your head under water. You might still be able to hear your friend as your head goes under, but the
sound is weak and muffled. Most of the sound from your friend's voice bounces off the water, back
into the air, because of the difference in density between the two media. The middle ear passes along
sound energy to the inner ear by providing a mechanical leverage of about 25 to 1 using the ossicles
to move the denser inner ear fluid. That is, the middle ear matches the impedance of air to the inner
ear fluid (see volume 2, chapter 8). Most of the mechanical energy present in the tympanum is
transmitted efficiently to the inner ear, although the outer ear and middle ear transfer frequencies in the
range of 1-3.5 kHz about 50 times more efficiently than frequencies outside this range.³

Acoustic Reflex and Temporary Threshold Shift. The middle ear also has a few small muscles
that can temporarily protect the inner ear from intense sounds. The stapedius muscle reduces
the mobility of the ossicles by pulling the stapes to the side. These muscles are activated by the
bilateral acoustic reflex within about 10-20 ms of when sound pressure exceeds 90-100 dB. The
acoustic reflex provides about 20 dB of protection. However, the response time for this reflex is
about 30-40 ms after the sound has started, and full protection takes up to about 150 ms longer.
The route from the auditory nerves to the stapedius is hardwired in the brain, so the acoustic reflex
is ordinarily below cognitive control (although some individuals, including the author, can
voluntarily activate it). Thus for explosions and gunfire, damage to the ear can take place before these
natural protections come into play. This strongly suggests the use of artificial noise suppression
via ear plugs where explosive sounds are a possibility.

The tensor tympani muscle is attached to the malleus and increases the tension on the eardrum
as part of a more general acoustic reflex to loud sounds that can take as long as 1 or 2 seconds. These
systems are not fail-safe protections. Extended exposure to loud sounds (in excess of 100 dB or
so) fatigues these muscles, reexposing the inner ear to punishing sound levels and risking hearing
damage. However, our ears do not alert us to this condition. Instead, another longer-term protective
mechanism comes into play, the temporary threshold shift (TTS),⁴ whereby our hearing gradually
adjusts to ongoing elevated intensity levels, and we lose the sense that the sound is too loud. If the
hair cells are not allowed to recover through periods of relative quiet, they gradually lose their
ability to respond, and they die, resulting in permanent hearing loss, or permanent threshold shift.
In addition to damaging the auditory mechanism, noise may contribute to loss of sleep, tension,
headaches, reduced vision, sexual impotence, heart disease, and even mental illness (Cohen,
Anticaglia, and Jones 1970). The moral:


Too much noise is bad for you!





6.2.3 Inner Ear

The stapes connects to the oval window at one end of the cochlea, a fluid-filled tube that connects to
the auditory nerve. The cochlea is a coiled double tube, connected at the center. Figure 6.1 shows it
uncurled. One end of the double tube is the oval window, the other end is a round window, which is
also covered with a membrane. The oval window side of the cochlea is the scala vestibuli. The round
window side of the cochlea is the scala tympani. At the apex of the cochlea, these two scala are
connected by a narrow aperture, the helicotrema. The two scala are filled with perilymph, which is similar
to cerebral spinal fluid. As the oval window is vibrated by the stapes, the perilymph moves back and
forth. The membrane over the round window is pushed in and out in a complementary motion.

The scala vestibuli and scala tympani enclose the scala media, filled with endolymph, similar
to intracellular fluid. Within the scala media is the organ of Corti, which is the receptor organ for
hearing. It rests on part of the membranous labyrinth, the basilar membrane.

6.2.4 Basilar Membrane 

The basilar membrane runs down the center of the cochlea. About 30,000 hairlike receptor units
called hair cells (cilia) are attached to it along its length. On the other end, the hair cells are
anchored to the more stable tectorial membrane. The hair cells connect the two membranes along
their entire length. There is a row of inner hair cells and several rows of outer hair cells. The inner
hair cells provide most of the afferent information to higher neural centers.

The basilar membrane vibrates under the pressure of the perilymph in response to sound. It is
thinner, stiffer, and narrower at the base of the cochlea than at the apex. Imagine a guitar string that is
thicker at one end than the other: the thin end will vibrate more readily at high frequencies than the
thicker end. Thus, for a pure tone of given frequency, only one relatively narrow region on the
basilar membrane vibrates sympathetically. Low frequencies vibrate the perilymph most intensely at the
apex of the basilar membrane, and high frequencies vibrate it most intensely near the oval window.
Thus, the position along the basilar membrane encodes frequency for the auditory nerve. In the
language of psychoacoustics, the basilar membrane transforms frequency to place. According to ideas
originally put forth by G. S. Ohm and Helmholtz, the basilar membrane was thought of as a kind
of spectrum analyzer that maps frequency to position. Figure 6.2 shows a map relating frequency
to position along the basilar membrane. As the basilar membrane is vibrated, the hair cells are
sheared back and forth between the tectorial membrane and the basilar membrane. Hair cells
receiving significant movement trigger an electrical signal that is transmitted to a nerve lying under the
organ of Corti. These neurons transmit signals back along the auditory nerve to the brain stem.

Figure 6.2 shows that about half of the basilar membrane is used to encode frequencies between
25 Hz and about 1.6 kHz. All the remaining frequencies in the range of human hearing, from
1.6 kHz to about 20 kHz, fit into the remaining half of the area. Perhaps not surprisingly, we have
greater difficulty discriminating higher pitches than lower ones.

Figure 6.2

Frequency response of the basilar membrane. (Adapted from Békésy 1960.) Curves are shown for
25 Hz, 100 Hz, 400 Hz, and 1600 Hz; the horizontal axis is distance from the oval window (mm).

Place Theory. If the frequency of a tone doubles, the position of maximum displacement along
the basilar membrane moves toward the oval window by a constant amount. This suggests that the
basilar membrane encodes frequency ratios, not frequency differences. Here is physiological
evidence of the logarithmic relation between pitch and frequency: the basilar membrane uses a
logarithmic encoding for pitch. This observation, the place theory of pitch, holds that there is a direct
relation between the frequency presented to the basilar membrane and the place along its length
that is displaced most strongly. More generally, the place theory holds that there is a tonotopic
mapping between the basilar membrane and an associated region of the auditory cortex that
performs frequency discrimination based on the topology of the basilar membrane.
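
The statement that each doubling of frequency moves the point of maximum displacement by a constant
amount can be sketched as an idealized logarithmic map. The function name and the millimeters-per-octave
constant below are hypothetical, chosen only to illustrate the idea; they are not measurements of the
basilar membrane.

```python
import math

MM_PER_OCTAVE = 3.5   # hypothetical constant, for illustration only

def place_shift_mm(f_low, f_high, mm_per_octave=MM_PER_OCTAVE):
    """Idealized shift of the point of maximum displacement toward the
    oval window when frequency rises from f_low to f_high (in Hz)."""
    octaves = math.log2(f_high / f_low)   # a frequency ratio, not a difference
    return mm_per_octave * octaves

shift_low_octave = place_shift_mm(100.0, 200.0)    # 100 Hz apart
shift_high_octave = place_shift_mm(400.0, 800.0)   # 400 Hz apart
print(shift_low_octave, shift_high_octave)
```

Although the two pairs differ by 100 Hz and 400 Hz, each is a 2:1 ratio, so the idealized shift is
the same for both. A map based on frequency differences would not have this property.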

Frequency Sharpening. But there is at least one problem with the frequency-to-place theory.
The curves in figure 6.2 suggest that our ability to discriminate between two close frequencies
should be much poorer than it actually is. In fact, our hearing does a much better job than one would
predict from the passive mechanics of the basilar membrane. Kachar et al. (1986) discovered a
possible explanation. They observed through video microscopy that outer hair cells change length in
response to nerve stimulation. Ashmore (1987) stimulated a single outer hair cell and observed its
length change substantially. The effect persisted at frequencies into the kilohertz range. Current
thinking is that outer hair cells help to sharpen the tuning of the basilar membrane by affecting how
it vibrates, directing and focusing the responsiveness of the inner hair cells. It seems that sound
analysis in the cochlea is influenced by a dynamic neurophysiological feedback process.

6.3 Psychoacoustics and Psychophysics 

The aim of this section is to develop a simple model of the hearing system. The psychologically
relevant characteristics of music include pitch, loudness, timbre, duration, amplitude envelope,
spectral envelope, consonance, volume, rhythm, vibrato, and sound location information.

Psychoacoustics is the science of how we perceive sound. An interdisciplinary field, it draws
upon physics, biology, psychology, engineering, and music. Psychoacoustics starts with the basic
subjective attributes of sound as we perceive it and seeks to understand the ways these perceptions
relate to each other.





Psychophysics focuses just on the crossover point where physics leaves off and psychology
begins, where the objectively observable stops and the subjective starts. Its aim is to develop
metrics that relate the external physical variables of sound (the Φ variables) to the internal
psychoacoustic variables (the Ψ variables).⁵ For example, the Φ intensity of a sound can be quantified
easily by direct measurement (see section 4.24). The corresponding Ψ variable is loudness. The
idea that the Ψ variables could be quantified was first suggested by G. T. Fechner in the 1860s
(Allen and Neely 1997).

6.3.1 Science and Perception 

Several problems exist in developing objective measures of our perceptions of sound and 
music. 

■ Subjectivity. Objective measurement is a cornerstone of the scientific method, but perception
of music and sound is subjective and not directly available to objective measurement. For
example, it would be nice to have an objective measure that relates Φ sound intensity to Ψ loudness.
But we can no more directly apply objective measurements to subjective states than we can
develop a thermometer for happiness. Subjective states are only indirectly available for objective
observation.

■ Nonlinearity. Ψ variables are often not linearly proportional to their corresponding Φ variable.
Pitch and loudness are cases in point.

■ Nonorthogonality. Ψ variables often influence each other in quixotic and counterintuitive
ways. For instance, Φ frequency clearly has a major impact on Ψ pitch perception, but Φ sound
intensity also has an impact on Ψ pitch. In two-dimensional Cartesian space, the x and y dimensions
are orthogonal and x and y can vary independently. Pitch and loudness are nonorthogonal.

6.3.2 Science Is Limited 

Psychoacoustic research must rely on experimental methods that externalize the inner
experiences of listeners. We can use such information to construct models of how human hearing
functions. But there are many problems with this approach. For example, there is the problem of
reconciling differing results due to conflicting experimental methodology. Suppose we conduct
loudness experiments using noise bursts as stimuli; how do we relate our results to another
experiment that used pure tones? How do we relate either of these to an experiment that used
orchestral instruments? This is like surveying the ocean by sampling its depth in only a few
places. What if in one experiment we ask subjects to evaluate how "agreeable" a musical interval
is, but in another we ask how "consonant" the interval is? How shall we reconcile such semantic
differences in experimental design? Psychologists thus face a problem not unlike that described
in the ancient tale of the blind men and the elephant.⁶ Anyone following the progress of science
must live with the suspense of an unsolved mystery. To those who are not a part of the
conversation, scientific discourse can be very much like tuning into a heated talk radio program, in
Greek.





6.3.3 Science Is Messy

The ideas developed by science that seem effective usually result in a body of explanatory
literature that describes the mind-set, or paradigm, that these ideas represent. Upheavals in this
mind-set occur at unpredictable intervals when new, more expressive models of the subject
emerge. The valid kernels of truth within the old paradigm (if they exist) are incorporated as a
component of the new paradigm. However, it is not always the case that a new paradigm is simpler
than the old; it may assert the importance of previously ignored or undiscovered elements, thereby
actually complicating matters.

Sometimes the discarded elements of old mind-sets persist long after they are shown to be
limited or erroneous. For whatever reason (social convenience or aesthetics) they linger on. An
example of this phenomenon is the so-called psychophysical law that claims that the relationship
between Φ intensity and Ψ intensity is always logarithmic. By this "law" the multiplication of Φ
intensity by some amount purportedly always produces a corresponding addition to the perceived
Ψ intensity. This concept is often associated with Weber's law,⁷ which says that as the intensity
of a stimulus increases, the ability to detect a difference between two levels of the stimulus
decreases. In fact, I tacitly referred to this when I described the motivation for constructing the
decibel scale in chapter 4.
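
What the so-called psychophysical law claims can be made concrete with a small sketch: if Ψ were
exactly a logarithm of Φ, each multiplication of Φ by a fixed factor would add a fixed increment to Ψ.
The function name, the constant `k`, and the reference value are assumptions made for illustration;
the sketch shows the claim itself, not a validated model of hearing.

```python
import math

def perceived_intensity(phi, phi_reference=1.0, k=1.0):
    """The purported 'law': psi = k * log10(phi / phi_reference)."""
    return k * math.log10(phi / phi_reference)

# Multiply the physical intensity by 10 at each step.
phis = (1.0, 10.0, 100.0, 1000.0)
psis = [perceived_intensity(phi) for phi in phis]

# Under the claimed law, every tenfold multiplication adds the same amount.
increments = [later - earlier for earlier, later in zip(psis, psis[1:])]
print(increments)
```

All three increments come out equal, which is exactly the multiplication-to-addition behavior the
"law" asserts.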

Unfortunately, the rationale behind the decibel scale as a measure of loudness is inadequate at
least in part because it ignores the fact that our hearing varies in its sensitivity to different frequency
and intensity ranges. Decibel measure isn't used anymore to measure Ψ intensity, but it is still
valuable as a measure of Φ intensity in engineering disciplines, for instance, in designing and using
recording equipment.

I suppose we could come up with a crude metric of the complexity of a subject by tallying up all the
partial explanations and conflicting theories that are currently extant about it, and then multiplying that
by the number of years scientists have been studying the problem. The development of a scientific
model of human hearing has been under way for at least 140 years, since the early work of Fechner, and
we are still nowhere near having an established body of laws. By this measure alone we can see that the
auditory system is hugely complex, containing redundancies, contradictions, and even deceptions.

Some things, such as the outer limits of loudness and pitch, are by now well established.
However, though I try to restrict this discussion to just the settled facts, there is no mistaking this
territory for the comforts of home. The best advice I can offer to the interested reader is to buy a radio
and start learning Greek!

6.4 Pitch 

Pitch is the subjective Ψ variable corresponding most closely to the objective Φ variable
frequency. Pitch is sometimes called the response pattern to frequency. But there's no simple
equality between them. While our sense of pitch is roughly proportional to frequency, it is also
influenced by frequency range, loudness, and the presence or absence of other frequencies.
Another difference is that pitch is limited to our range of hearing (17 Hz to 17 kHz) but frequency
is unlimited.

A commonly quoted definition of pitch given by the American National Standards Institute
(ANSI 1999) says, "Pitch is that auditory attribute of sound according to which sounds can be
ordered on a scale from low to high." Unfortunately, stipulating precisely what "that auditory
attribute" is turns out to be surprisingly complex.

A sound is pitched if its waveshape is highly redundant through time. Otherwise we hear noise.
Even a pitched tone must have a certain minimum duration for its pitch to be perceived; otherwise
it is heard as a click. Tones with rich harmonic spectra will appear to have a more definite pitch
than sinusoids, simpler harmonic spectra, or inharmonic spectra. Very complex inharmonic spectra
may appear to have several pitches. In the case of large bells, the fundamental, or hum note, is not
the same as the perceived pitch of the instrument, the strike note.

6.4.1 Pitch Perception 

G. S. Ohm (1843) first put forward a theory that the ear derives pitch by performing Fourier
analysis on acoustical signals (see volume 2, chapter 3). Ohm's theory, sometimes called Ohm's law
of acoustics, which he developed just after Fourier's original work, was perhaps the first place
theory of pitch. One of the predictions of this theory is that the ear should be relatively insensitive to
phase information, which has been shown generally to be true.

But place theory fails to account for how the ear organizes frequency components into tones
instead of hearing all frequencies as unique pitches. Also, because of the nature of the Fourier
transform, place theory implies a one-to-one correspondence between frequencies in the acoustical
signal and pitches that the ear should detect. But we sometimes hear phantom pitches where there
is no energy in the signal. How can that be?

The Missing Fundamental. The place theory of Ohm hit a major stumbling block with an
experiment performed by August Seebeck (1841). Suppose I play two tones for you: one is a pure
sinusoid, the other is pitched but complex (having many harmonics). You can adjust the pitch of the
pure tone with a knob. Your job is to adjust the pitch of the pure tone to match the pitch of the
complex tone. It is virtually certain that you will adjust the frequency of the pure tone to the
fundamental frequency of the complex tone even if there is no measurable energy at the fundamental
frequency (see section 2.8.1).

Suppose the partials of the complex tone are 300, 400, and 500 Hz. You will most likely
distinctly hear a "fundamental" at 100 Hz, the greatest common factor of the overtones. You will not
hear an inharmonic tone with fundamental at 300 Hz. So convinced are our ears of the ubiquitous
phenomenon of a fundamental with harmonics at integer multiples that even if there is no
fundamental, our hearing is hardwired to invent one. This means that Ohm's theory, which requires a
one-to-one correspondence between frequencies and pitch, runs into the contradiction of a pitch
with no corresponding frequency.
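As a small illustration of the arithmetic in this example, the implied fundamental of a set of harmonic partials can be computed as their greatest common divisor. This is only a sketch of the number relationship, not a model of hearing; the function name `implied_fundamental` is made up for this example, and it assumes the partial frequencies are whole numbers of hertz.

```python
from functools import reduce
from math import gcd

def implied_fundamental(partials_hz):
    """Greatest common divisor of integer partial frequencies, in Hz."""
    return reduce(gcd, partials_hz)

partials = [300, 400, 500]
f0 = implied_fundamental(partials)

# Each partial is an integer multiple (harmonic number) of the implied fundamental.
harmonic_numbers = [p // f0 for p in partials]

print(f0)                # 100
print(harmonic_numbers)  # [3, 4, 5]
```

The partials turn out to be harmonics 3, 4, and 5 of a 100 Hz fundamental that is itself absent from the signal.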





The phenomenon of the missing fundamental is what enables us to hear satisfying music come
from the tiny speaker of a transistor radio: our hearing invents the fundamentals that the speaker
can't reproduce.

Periodicity Theory. The explanation that Seebeck provided as a substitute for Ohm's place
theory came to be called periodicity theory. It was developed further in the 1940s by Schouten,
Ritsma, and Cardozo (1962). This theory supposes that the neural signals from the cochlea to the
brain encode timing information related to the phase of the acoustical signal and that the brain has
some means of measuring time intervals.

Periodicity theory notes that the combination of several high harmonics can sum to create a
waveform with prominent time domain features whose period is the same as that of their common
fundamental. This way, a pitch period-measuring capability in the brain would get more or less the
same information from a tone with or without a fundamental.
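The claim about the summed waveform can be checked numerically. The sketch below sums sinusoids at 300, 400, and 500 Hz (equal amplitude and zero phase are assumptions made here for simplicity) and compares the waveform with a copy of itself shifted by 1/100 s, the period of the absent 100 Hz fundamental, and by 1/300 s, the period of the lowest partial actually present. The helper names are hypothetical.

```python
import math

PARTIALS_HZ = [300.0, 400.0, 500.0]

def complex_tone(t, partials_hz=PARTIALS_HZ):
    """Sum of equal-amplitude, zero-phase sinusoids evaluated at time t (seconds)."""
    return sum(math.sin(2.0 * math.pi * f * t) for f in partials_hz)

def max_mismatch(shift, samples=1000, span=0.02):
    """Largest difference between the waveform and a time-shifted copy of itself."""
    times = [i * span / samples for i in range(samples)]
    return max(abs(complex_tone(t) - complex_tone(t + shift)) for t in times)

# Shift by the period of the missing 100 Hz fundamental: the waveform repeats.
repeat_error = max_mismatch(1.0 / 100.0)

# Shift by the period of the 300 Hz partial: the waveform does not repeat.
partial_error = max_mismatch(1.0 / 300.0)

print(repeat_error < 1e-9, partial_error > 0.5)
```

The waveform repeats every 10 ms even though it contains no energy at 100 Hz, which is the timing cue that periodicity theory relies on.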

Periodicity theory also explains why amplifying the electrical activity in the auditory nerve
results in an electrical signal similar to the acoustical signal presented to the ear.

However, it is neurologically impossible for neurons to fire more often than about once every 1 ms,
an interval called the absolute refractory period. So periodicity theory runs into trouble for pitches
above 1000 Hz. Another difficulty is that periodicity theory would lead us to expect the ear to be
quite sensitive to the phase of the harmonics in complex tones. However, the prediction of place
theory, that the ear largely ignores phase, agrees very well with experiments. Ohm had suggested that
perception of sound depends only on the distribution of energy among partials and does not depend
upon differences of phase. The physical demonstration of this was considered a major accomplishment of
Helmholtz, and the theory was effectively unchallenged for a century.

Beyond the Peripheral Theories. Clearly, place and periodicity theories have merit and also
liabilities. Both suppose that acoustical processing occurs in the periphery of the auditory system:
the basilar membrane and lower nerve centers of the auditory cortex. So these theories are called
jointly the peripheral theories.

The main drawbacks of the peripheral theories are (for periodicity theory) sensitivity to the
phase relationship between partials, and (for place theory) the impossibility of explaining the
missing fundamental in spectral terms.

Also, experiments with dichotic signals (where different information is sent to each ear
separately via headphones) have demonstrated a necessary role for the brain in pitch detection.
Houtsma and Goldstein (1972), for example, demonstrated that we still manage to hear a missing
fundamental even if some harmonics are sent to one ear via headphones and different harmonics
of the same fundamental are sent to the other ear. This shows that the brain must be the agent that
combines the harmonics to determine the fundamental, because if this were handled peripherally,
we would hear different pitches in each ear rather than a single fused tone.

Difficulties with peripheral theories and experiments with dichotic signals led researchers to
central processing theories of pitch perception, which emphasize processing conducted in
the brain (Goldstein 1973; Wightman 1973; Terhardt 1974). These theories presume that
pattern-matching systems in the brain search for order in the components arriving from the
peripheral auditory system. Pattern matching accounts well for our ability to detect the fundamental of
stretched harmonics of a piano tone, and to dig harmonic information out of intensely noisy signals.

Assuming higher neural processing for pitch perception also helps explain the fact that we can
learn pitch discrimination. When I taught solfeggio and sight-singing in college, I observed over
the course of the semester that students' capacity to discriminate and categorize pitch improved,
sometimes dramatically. (See section 9.22 for a discussion of self-learning neural systems.)

Pitch perception remains one of psychoacoustics' longest-running controversies, with an
unbelievable number of competing theories. Perhaps the theoretical difficulties are a consequence of
the importance of pitch perception to survival. A faculty this critical to life can't be entrusted to
only one adaptation; redundancy and competitive analysis in both the periphery and the brain are
required.

6.4.2 Range and Quality of Pitch Sensation

When I indicated that the range of hearing is 20 Hz to 20 kHz, that was just to throw out some round
numbers that are easy to remember. In fact, the boundaries are fuzzy and vary enormously with age,
gender, and life experience.

At the top end, a young person in good health might be able to hear up to 17 kHz or so. Adults
lose the top end until, nearing old age, it might be down to around 12 kHz for women and 5 kHz
for men.

Pitch discrimination drops off above about 5 kHz for all of us, which perhaps explains why few
musical instruments are designed to intone beyond that range. The highest note on the piano is C8,
4186 Hz.

At the low end, sounds below about 30 Hz become progressively harder to hear as having a pitch.
Below that frequency, we start to feel sound as physical impact. The lowest note on the piano is
A0, 27.5 Hz.

The range of finest perception both in terms of pitch and loudness is between 1 kHz and 4 kHz,
which coincidentally is where most speech information occurs.

6.4.3 Just Noticeable Difference of Pitch

Two important attributes of a ruler are its length and the fineness or precision of measurements that
can be made with it.⁸ If the range from lowest to highest frequency the ear can hear corresponds
to the length of some kind of ruler, then to what perceptual quality does the precision of
measurement correspond?

If the difference between two pitches is not noticeable, we judge them subjectively to be the
same, whether they are physically the same or not. Recall that Euler wrote, "The sense of hearing
is accustomed to identify with a single ratio, all the ratios which are only slightly different from
it, so that the difference between them be almost imperceptible."

Effectively, pitches must differ by a minimum threshold for us to distinguish them. This
threshold is the just noticeable difference (JND) of pitch. The pitch JND is the measure of sensitivity of
the ear to changes in pitch. It is sometimes called the pitch difference limen, or pitch DL. How well
the ear can distinguish between adjacent pitches determines the precision of our hearing.

6.4.4 The Weber-Fechner and Stevens Laws

Interestingly, the size of the pitch JND is not constant. The JND of high frequencies covers a larger
span of frequencies than the JND of low frequencies. Attributed to Ernst Weber, the JND is a
classic psychophysical invention that has been applied not just to the senses (e.g., color and taste) but
even to the price of houses. In general, Weber observed that the greater the magnitude of a stimulus,
the greater must be the change in that stimulus before any difference is detected.

Weber's law correctly predicts that the just noticeable difference of pitch grows with increasing
magnitude (greater magnitude means, in this case, higher frequency). If we call the size of the JND
ΔI and the magnitude of a comparison stimulus I, Weber's law says that

$$\frac{\Delta I}{I} = k, \qquad \text{Just Noticeable Difference (JND)} \tag{6.1}$$

where k is a constant of proportionality. The parameter k takes on different values for different
sensory stimuli.

Gustav T. Fechner (1801-1887) based his work on Weber's JND but refined it by suggesting that
for many percepts (including pitch and loudness), a geometric increase in Φ magnitude is
perceived as an arithmetic increase in Ψ magnitude. Thus, Ψ magnitude increases in proportion to the
logarithm of the Φ magnitude, and large changes in Φ are compressed into smaller changes in Ψ.
Fechner's law can be expressed as

$$\frac{\Psi}{\log \Phi} = k, \qquad \text{Weber-Fechner Law} \tag{6.2}$$

where Ψ is the magnitude of the sensation, Φ is the magnitude of the stimulus, and k is the constant
of proportionality. It was Fechner's work that led to the theoretical underpinnings of the decibel.

Experiments have shown that the Weber-Fechner law works better for some stimuli than others
and generally works best for stimuli of medium intensity. Stevens (1962) generalized the
Weber-Fechner law so it could be applied more widely. He suggested that Ψ magnitude increases
in proportion to the Φ magnitude raised to a power:

$$\frac{\Psi}{\Phi^{p}} = k, \qquad \text{Stevens Law} \tag{6.3}$$

where Ψ is the magnitude of the sensation, Φ is the magnitude of the stimulus, p is its exponent,
and k is a constant of proportionality.

A logarithm and a power function can be made to resemble each other if the exponent is between
0 and 1. For example, compare curves 1 and 2 in figure 6.3a. Curve 1 is a power law approximation
of the Weber-Fechner log curve 2. Since even in the best of circumstances we can only estimate
the Ψ/Φ relation (because Ψ is subjective and we can't objectively measure it), the two approaches
are reasonably interchangeable. However, for stimuli such as the apparent length of an event, the
degree of compression between Φ and Ψ domains is less than the Weber-Fechner law would
predict and may be better modeled with a power exponent greater than 1. In these cases, Φ changes
can produce equal or even larger changes in Ψ. In cases like this, the Stevens law, illustrated in
figure 6.3b, provides a richer range of mappings to experimental data and has been widely used.

Figure 6.3

Comparison of the Weber-Fechner law and the Stevens law functions. (a) Weber-Fechner law. (b) Stevens
law for various p, Ψ = kΦ^p.
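To make the difference between the two laws concrete, the following sketch evaluates both forms for a geometric series of stimulus magnitudes. The function names, the choice of base-10 logarithm, the value k = 1, and the exponents 0.3 and 1.5 are assumptions picked only for illustration; they are not fitted to any data.

```python
import math

def weber_fechner(phi, k=1.0):
    """Sensation under the Weber-Fechner law: psi = k * log10(phi)."""
    return k * math.log10(phi)

def stevens(phi, p, k=1.0):
    """Sensation under the Stevens law: psi = k * phi**p."""
    return k * phi ** p

# A geometric series of stimulus magnitudes: each is 10 times the previous one.
stimuli = [10.0, 100.0, 1000.0]
pairs = list(zip(stimuli, stimuli[1:]))

# Under Weber-Fechner, equal stimulus ratios give equal sensation differences.
log_steps = [weber_fechner(b) - weber_fechner(a) for a, b in pairs]

# Under Stevens, equal stimulus ratios give equal sensation ratios.
compressive_ratios = [stevens(b, 0.3) / stevens(a, 0.3) for a, b in pairs]
expansive_ratios = [stevens(b, 1.5) / stevens(a, 1.5) for a, b in pairs]

print(log_steps)           # each step is about 1.0
print(compressive_ratios)  # each ratio is about 2.0 (10**0.3)
print(expansive_ratios)    # each ratio is about 31.6 (10**1.5)
```

With an exponent below 1, a tenfold change in the stimulus is compressed into a much smaller change in sensation, much as the logarithm does; with an exponent above 1, the same change is expanded instead.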

6.4.5 Determining Pitch JND

How are such metrics established experimentally? For example, the pitch JND can be determined
as follows. Suppose I play a sequence of two sinusoids, both with the same loudness. The first tone
has a constant pitch; the second tone has a small vibrato. This allows you to tell the two tones apart.
As the subject of the experiment, you must tell me each time whether the pitch of the second tone
is "above" or "below" the first. (Saying "the same" is not an option.)

This process is a simplified version of the experimental method called two-alternative
forced-choice (2AFC). If the difference between the tones is large, your judgments will tend to
be categorical. But where the difference is slight, your answers will become increasingly
arbitrary, and when the frequencies are too close to be distinguished, your answers will be effectively
random (right approximately half the time). The experimenter examines your responses, looking
for the range over which your responses transition from 100 percent correct to random (50
percent correct). The midpoint of this transition zone, around 75 percent, is taken to be the JND at
that frequency.
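The last step, reading the JND off the transition zone, can be sketched as a simple interpolation. The data below are invented purely for illustration (they are not measurements), the function name `jnd_from_2afc` is hypothetical, and real experiments typically fit a smooth psychometric curve rather than interpolating linearly between points.

```python
def jnd_from_2afc(deltas_hz, proportion_correct, criterion=0.75):
    """Linearly interpolate the frequency difference where the proportion of
    correct answers first crosses the criterion (75 percent by default)."""
    points = list(zip(deltas_hz, proportion_correct))
    for (d0, p0), (d1, p1) in zip(points, points[1:]):
        if p0 <= criterion <= p1 and p1 > p0:
            return d0 + (criterion - p0) * (d1 - d0) / (p1 - p0)
    raise ValueError("criterion is not bracketed by the data")

# Invented results for a hypothetical listener tested around 1 kHz:
# frequency difference between the tones (Hz) and fraction of correct answers.
deltas = [1.0, 2.0, 4.0, 6.0, 8.0, 10.0]
correct = [0.50, 0.55, 0.70, 0.80, 0.95, 1.00]

jnd = jnd_from_2afc(deltas, correct)
print(round(jnd, 3))  # 5.0
```

For this made-up listener the 75 percent point falls at a difference of 5 Hz, which at 1 kHz corresponds to a JND of 0.5 percent.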

Such a method could tell us the JND of any frequency, but only for sinusoids (because that's all
we tested with). What about other sounds: sustained sounds, short sounds, sounds with varying
pitch, sounds with steady pitch or quickly varying pitch, simple sinusoids vs. complex tones? If
complex (read: musically interesting) tones are used, which complex tones shall we compare? All
these parameters (and more) will have an effect on the pitch JND we end up measuring.



Figure 6.4

Just noticeable difference for pitch. (Adapted from Roederer 1973.)

Psychophysicists have traditionally taken a bottom-up approach to such questions. If they can
get a theory right for simple steady-state sinusoids, they figure they can use it later to explain more
complex phenomena. I must say, as a musician, I am always disappointed by this approach because
it seems that the elementary results of psychophysics are almost uselessly simplistic in realistic
musical situations. On the other hand, correct but limited knowledge is better than none (and is
certainly a big improvement on erroneous information or superstition).

The JND of pitch has been found experimentally to depend not just upon frequency but also
upon intensity and duration as well as the rapidity of frequency change. The heavy line in
figure 6.4 shows the pitch JND for constant-intensity (80 dB) sinusoids whose frequency was slowly
and continuously modulated up and down. The light lines show several JND thresholds for
reference: 0.5 percent and 0.6 percent, 1 percent and 3 percent. We observe that the heavy line mostly
lies between 0.5 percent and 0.6 percent.

The figure shows frequency f on the x-axis and the corresponding detectable frequency
difference Δf on the y-axis. The ratio Δf/f, sometimes called the frequency resolution of the ear, shows
the pitch JND for frequencies between about 30 and 5000 Hz. The closer this line is to the x-axis,
the smaller is the JND. For example, we don't seem to notice a difference of less than ±5 Hz around
1 kHz; thus the JND, expressed as a percentage, is 0.5 percent at 1 kHz. We also don't seem to
notice a difference of less than ±30 Hz for tones around 5 kHz, or 0.6 percent.

Note also that

■ The low and high ends have wider JNDs, and the bottom end is worse than the top end.

■ The most acute region is from 1 to 3 kHz, where the JND is about 0.5 percent of the frequency.
For reference, that's about one twelfth of a semitone, or 8.3 cents (see section 3.4).

■ Rapidly changing frequency fluctuations can produce JNDs up to 30 times as small.

■ Shorter-duration tones produce larger JNDs.

■ Frequency resolution of the ear is relatively independent of sound intensity.

JNDs also depend a great deal upon the individuals tested and their degree of musical training as
well as upon the methods used to measure the JNDs.
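The percentages quoted above can be turned into hertz and cents with a few lines of arithmetic. This sketch assumes the standard definition of the cent as 1/1200 of an octave, measured on a logarithmic frequency scale; the function names are made up for this example.

```python
import math

def jnd_hz(frequency_hz, jnd_percent):
    """Width of the JND in hertz at a given frequency."""
    return frequency_hz * jnd_percent / 100.0

def jnd_cents(jnd_percent):
    """Size of a fractional frequency change in cents (1200 cents per octave)."""
    return 1200.0 * math.log2(1.0 + jnd_percent / 100.0)

print(jnd_hz(1000.0, 0.5))       # 5.0 Hz at 1 kHz
print(jnd_hz(5000.0, 0.6))       # about 30 Hz at 5 kHz
print(round(jnd_cents(0.5), 2))  # 8.63 cents
print(round(100.0 / 12.0, 2))    # 8.33 cents, one twelfth of a semitone
```

An exact conversion of 0.5 percent gives roughly 8.6 cents, so "one twelfth of a semitone" is a convenient round figure rather than an exact equivalence.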

6.4.6 Interval Perception

I suggested that we could compare pitch JND to the tick marks on a ruler, but, like all analogies,
this one has its limits. Wouldn't it be convenient if our ears measured pitch difference as the number
of JNDs between pitches? Alas, it is not so, and it really can't be if we think about it.

While pitch JND gives us an understanding of pitch similarities, the JND provides no
information about how we judge pitch differences. The only thing that JND knowledge contributes to this
subject is that pitches lying inside a JND are experienced as the same while pitches lying outside
a JND are experienced as different, but JND says nothing about the quality of that difference. We
must address this question separately.

Suppose I play a pair of sinusoids, one fixed, the other beginning in unison with it but diverging
from it by slowly gliding up in frequency. We might hypothesize that just as the difference between
two points constantly increases as the points diverge in space, so too the ear should experience a
constantly increasing difference in frequency as a constantly increasing difference in pitch.

This hypothesis is partly true. We hear the tone height of the pitch that is gliding up continue
to grow. However, ever more widely separated pitches do not always sound increasingly different,
as one would expect if the only thing the ears paid attention to was the number of JNDs between
pitches. Instead, as the distance reaches a doubling in frequency, the tones begin to sound alike
again, as they did when they were in unison. This perception repeats at each subsequent frequency
doubling, an effect called octave equivalence (see section 2.3.3). The equivalence is felt so
strongly that virtually all musical scales around the world are organized around the 2:1 ratio of the
octave, and pitches related by octaves are virtually always given the same name.

Interestingly, the octave, as a physical frequency ratio of 2:1, always corresponds exactly to
the subjective pitch difference of an octave, making the octave a rare instance where objective
and subjective measurements seem to match exactly. Perhaps this symmetry between object
and subject is why pitch is the element of hearing most heavily relied upon to convey musical
information.

All this suggests that pitch is more than a one-dimensional sense of high and low. Révész (1954)
developed a two-component theory of tone, suggesting that there are at least two principal
interlocking structures in pitch:

■ The linear span of JND pitch differences from the bottom to the top of our hearing range, which
he called tone height.

■ The circular span of interval differences within the compass of each octave, which he called
chroma. Chroma refers to the position of a tone within an octave.






Figure 6.5 

Tone height and chroma. 


We can reconcile the concepts of tone height and chroma in three dimensions (Shepard 1982).
Figure 6.5 represents tone height along the y-axis and chroma as an angle on a circle in the x-axis
and z-axis. The combination results in a helix. The movement of a sinusoid from C4 to C5, for
instance, is represented as a movement upward in tone height but as a return to the starting angle
in chroma.
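One way to picture the helix is to compute coordinates for a few frequencies. In this sketch, tone height is measured in octaves above a reference frequency and chroma is the fractional part of that octave count, turned into an angle. The reference value of 261.63 Hz for C4, the unit radius, and the function name are assumptions made for illustration only.

```python
import math

C4_HZ = 261.63  # assumed reference frequency for C4

def helix_coordinates(frequency_hz, reference_hz=C4_HZ):
    """Return (x, y, z): y is tone height in octaves above the reference;
    x and z place the chroma as an angle on a unit circle."""
    octaves = math.log2(frequency_hz / reference_hz)
    chroma_angle = 2.0 * math.pi * (octaves % 1.0)
    return (math.cos(chroma_angle), octaves, math.sin(chroma_angle))

c4 = helix_coordinates(C4_HZ)
c5 = helix_coordinates(2.0 * C4_HZ)         # one octave higher
g4 = helix_coordinates(C4_HZ * 2 ** (7 / 12))  # equal-tempered fifth above C4

print(c4)  # (1.0, 0.0, 0.0)
print(c5)  # (1.0, 1.0, 0.0): same angle, one octave higher
print(g4)  # a different angle, a little over half an octave up
```

C4 and C5 share the same position on the chroma circle and differ only in height, which is the geometric picture of octave equivalence.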

Octave equivalence is perhaps just a very strong instance of interval affinity. Similar intervals
are highly identifiable, a trait much exploited by musicians. Fourths show "fourthness" and
fifths show "fifthness" regardless of their orientation in pitch space. Understanding the musical
qualities of affine intervals is one of the subjects of harmony theory, which in turn is one of the
subdisciplines of music theory.

There remains the problem of how to actually construct useful musical scales out of the continuum
of available pitches within the chroma. Ordinarily, musicians select a small subset of intervals from
the chroma, and these become the pitch classes of the scale. When the pitch classes are replicated
across each octave, they become the pitches of the available pitch space, or gamut. In the West, the
scale has 12 chromatic pitches. We can visualize the pitch space of the equal-tempered scale as
shown in figure 6.6, which is a projection of figure 6.5 along the y-axis.⁹ Because humans can hear
ten or so octaves, the spiral shows ten revolutions, where the outer ones are the lower octaves. The
12 lines radiating out are the 12 chromatic pitch classes of the Western scale. The set of points where
the lines intersect the spiral form the gamut of pitches of the equal-tempered scale (see chapter 3).

Figure 6.6

Chromatic pitch space.

To get a feeling for chroma and tone height, perform the following experiment on a piano:
starting from a low tone, play a sequence of major seventh intervals (separated by 11 semitones) up
the keyboard. For instance, C1, B1, A♯2, A3, G♯4, G5, and so on. You may hear an ambiguous
effect: though the pitch rises by sevenths, you might also be able to hear the sequence as though
it were decreasing by semitones. While the tones of a major seventh interval are relatively far apart
in terms of tone height, they are close together in terms of chroma. Hence, if you focus on tone
height, you hear the sequence ascend. If you focus on chroma, you hear it descend. Roger Shepard
noticed this effect in his early research, which led ultimately to his famous illusion.
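The ambiguity in this experiment is easy to see in numbers. The sketch below builds the sequence by adding 11 semitones at each step and then reduces each note to its pitch class. It assumes the common convention that MIDI note number 60 is C4 (so 24 is C1); the helper name `note_name` is made up for this example.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def note_name(midi_number):
    """Name a note from its MIDI number, assuming 60 = C4."""
    return f"{NOTE_NAMES[midi_number % 12]}{midi_number // 12 - 1}"

start = 24  # C1 under the assumed convention
sequence = [start + 11 * i for i in range(6)]  # rising major sevenths

names = [note_name(n) for n in sequence]
pitch_classes = [n % 12 for n in sequence]

print(names)          # ['C1', 'B1', 'A#2', 'A3', 'G#4', 'G5']
print(pitch_classes)  # [0, 11, 10, 9, 8, 7]
```

The note numbers climb by 11 each time (tone height), while the pitch classes step down by one semitone each time (chroma), which is why the sequence can be heard either way.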

6.4.7 Shepard Scale Illusion

Shepard (1964) wanted to test Révész's theory of tone height and chroma. If he could suppress one
of the two effects and the other effect still persisted, that would demonstrate that they are separate
perceptual attributes of pitch. In particular, if Shepard could suppress the sense of tone height,
chroma should be all that is left. The helix in figure 6.5 would collapse into a circle, and pitch
judgments would also become circular. He devised a demonstration of pitch circularity in 1964 that
proved Révész's theory. It has come to be known as the Shepard scale illusion or Shepard tone
demonstration.

A set of ten sinusoids at octave intervals is played as shown in figure 6.7. The frequencies glide
up smoothly together, rising continuously in pitch (in some versions, they rise by semitone steps).
The intensity of the low and high sinusoids is increasingly diminished, so the ear mostly hears the
sinusoids in the middle frequencies (implemented as a Gaussian-shaped intensity contour). As the
top sinusoid goes off the top end of the hearing range, it gradually drops below the threshold
of hearing and a new sinusoid is introduced from below. The whole effect is rather like the visual
illusion of a barber pole in motion, or the impossible staircase of the visual artist M. C. Escher:
constantly rising, never getting anywhere (figure 6.8).¹⁰

The equation for creating the original Shepard tone illusion based on movement by semitones
is given by

$$F(t, c) = F_{\min} \cdot 2^{(c\,t_{\max} + t)/t_{\max}} \tag{6.4}$$

Figure 6.7

Shepard scale illusion. (Figure label: Threshold of hearing.)

Figure 6.8

Impossible staircase.

where F(t, c) is the frequency of the partial c of tone t, t_max is 12 (because this version of the effect
is based on the chromatic scale), and F_min is the frequency of the lowest partial of the lowest tone.
The range of t is 0 ≤ t < t_max, and the range of c is 0 ≤ c < N, where N is the number of partials
to be generated. Shepard used N = 10. To create the first set of partials, set t = 0 and evaluate (6.4)
for all c. For the next step, increment t by 1, and evaluate for all c again; repeat for all t. It is also
necessary to adjust the loudness of each partial to achieve the contour shown in figure 6.7. This
step and the modification to variables t and t_max in equation (6.4) can be used to effect a smooth
glissando. Note that the Gaussian envelope shape is given in log frequency so that equal pitch
intervals occupy a uniform distance along the frequency axis.
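The procedure just described can be sketched in a few lines. The frequency function follows the form F(t, c) = F_min · 2^((c·t_max + t)/t_max); the value F_min = 20 Hz and the Gaussian envelope's center and width are assumptions chosen for illustration, not values given in the text, and the function names are hypothetical.

```python
import math

def shepard_frequency(t, c, f_min=20.0, t_max=12):
    """Frequency of partial c at semitone step t: f_min * 2**((c*t_max + t)/t_max)."""
    return f_min * 2.0 ** ((c * t_max + t) / t_max)

def gaussian_amplitude(frequency_hz, center_hz=640.0, width_octaves=1.5):
    """Bell-shaped amplitude contour defined on a log-frequency (octave) axis.
    The center and width are illustrative assumptions."""
    distance = math.log2(frequency_hz / center_hz)
    return math.exp(-0.5 * (distance / width_octaves) ** 2)

N = 10  # number of partials, as Shepard used

# Partials for the first two steps of the scale, each with its amplitude.
step0 = [shepard_frequency(0, c) for c in range(N)]
step1 = [shepard_frequency(1, c) for c in range(N)]
amplitudes0 = [gaussian_amplitude(f) for f in step0]

print([round(f, 1) for f in step0])  # 20.0, 40.0, 80.0, ... 10240.0
print(round(step1[0] / step0[0], 4)) # 1.0595, one equal-tempered semitone

# After t_max steps, partial c lands exactly where partial c + 1 started.
print(shepard_frequency(12, 3) == shepard_frequency(0, 4))  # True
```

The last line shows the source of the circularity: after twelve semitone steps every partial has moved into the position of its upper neighbor, so with the amplitude contour held fixed the sound at step 12 is essentially the sound at step 0.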

6.5 Loudness 

Loudness is the subjective Ψ variable corresponding most closely to the objective Φ variable
intensity. Loudness is sometimes called the response pattern to intensity. But there is no simple equality
between them. While our sense of loudness is roughly proportional to intensity, it is also influenced
by frequency range and the presence or absence of other frequencies. Another difference is that
loudness is limited to the distance between our threshold of hearing t_h (10⁻¹² W/m²) and the limit
of hearing l_h (10⁰ W/m²) but intensity is unlimited (see section 4.24).

The loudness JND is the amount by which the intensity of a sound must change in order for the
ear to register a difference in loudness. The size of the loudness JND is approximately proportional
to the intensity of the sound: the louder the sound, the greater must be the change in its loudness
before the change in loudness is registered. (This is a restatement of the Weber-Fechner law for
loudness.) However, the loudness JND varies substantially with frequency and intensity range, so
there is no simple linear relation. Also, there is no loudness equivalent to the octave, that is, judging
a sound to be "twice as loud" shows a much greater deviation among subjects than does judging
a pitch to be "an octave higher."

6.5.1 Relating Pitch and Loudness

As mentioned, the rationale of the decibel scale assumes our perception of loudness to be
independent of all other percepts such as frequency, but it is not. Because of the mechanical advantage
the ear gives to frequencies of 1-3.5 kHz, tones in this range are perceived as louder than tones
of equal intensity in other ranges.

Since loudness depends upon both intensity and frequency, a loudness scale properly requires
three dimensions: the independent Φ variables frequency f and intensity I and the dependent Ψ
variable loudness L. Since we're dealing with perceptual variables here, we must explicitly test for
every relation we want to measure. Thus, if we want to know when loudnesses are equal, we must
develop a metric for the equality of two loudnesses at different frequencies. If we want to compare
loudness differences, we must develop a metric for the difference of two loudnesses at equal
frequencies. The first metric is the phon, a measure of equal loudness. The second metric is the sone,
a measure of comparative loudness. Together, they allow us to account for the ear's varying
sensitivity to frequency and intensity.

6.5.2 The Phon Scale 

The phon scale identifies equal loudnesses across all perceivable frequencies and intensities. It
consists of a set of equal loudness contours that relate intensity in one region of frequency to the
intensity required to achieve equal loudness in other regions of frequency. By definition, any
frequency at the threshold of hearing is exactly 0 phon.

The phon is defined as identical to dB SIL at 1000 Hz from the threshold of hearing to the limit
of hearing. Thus, at 1000 Hz the threshold of hearing, t_h = 10⁻¹² W/m², is defined as 0 phon, and
a level of 120 phons equals 120 dB SIL above the 0 phon reference (recall that dB SIL expresses
a ratio of two intensities). For example, at 1000 Hz a sinusoid with 10 dB SIL has a loudness of
10 phons, a 20 dB SIL sinusoid has a loudness of 20 phons, and so on.
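Because the phon scale coincides with dB SIL at 1000 Hz, the loudness level of a 1000 Hz sinusoid follows directly from its intensity. The sketch below assumes the usual decibel formula for intensity ratios, 10·log10(I/I_ref), with the threshold of hearing as the reference; the function names are made up, and the second function applies only at 1000 Hz.

```python
import math

THRESHOLD_OF_HEARING = 1e-12  # W/m^2, the 0 dB SIL reference intensity

def db_sil(intensity_w_per_m2):
    """Sound intensity level in dB relative to the threshold of hearing."""
    return 10.0 * math.log10(intensity_w_per_m2 / THRESHOLD_OF_HEARING)

def phons_at_1khz(intensity_w_per_m2):
    """Loudness level in phons of a 1000 Hz sinusoid.
    Valid only at 1000 Hz, where phons and dB SIL coincide by definition."""
    return db_sil(intensity_w_per_m2)

print(phons_at_1khz(1e-12))  # 0 phons at the threshold of hearing
print(phons_at_1khz(1e-11))  # about 10 phons
print(phons_at_1khz(1.0))    # about 120 phons at the limit of hearing
```

At any other frequency the conversion is no longer this direct, because equal intensities no longer produce equal loudness.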

Having defined the phon scale at 1000 Hz as identical to dB SIL, we now extend the phon scale
to frequencies other than 1000 Hz. We do this by comparing sinusoids at various frequencies and
intensities to a set of reference intensities at 1000 Hz. In general, for a sinusoid with frequency f,
we want to know what intensity I is required so that it will have the same loudness L as a sinusoid
at 1000 Hz. Let e be the criterion of equal loudness. Then for some frequency f and loudness L,
we want to solve the relation I = e(L, f), which tells us what intensity I is required for a sinusoid
to achieve loudness L at frequency f.

Ordinarily, the phon scale is evaluated at 10 phon increments from 0 to 120 phons. An
approximation to this set of curves is shown in figure 6.9. It shows the contours of equal loudness for
sinusoids that were first established experimentally by Fletcher and Munson (1933). We see that in
general low frequencies must have greater intensity in order to have the same loudness as frequencies
around 1000 Hz. This is especially true when low frequencies also have low intensity. The same is
also true of high frequencies but with somewhat less exaggeration. The curves in figure 6.9 are
adapted from those recommended by the International Organization for Standardization (ISO 226, 1987).

The equal loudness curves shown in figure 6.9 are also called equal loudness contours because
they can be thought of as delineating curves of equal elevation above the two-dimensional
frequency/intensity plane. Imagine figure 6.9 as a three-dimensional map that we are looking
straight down on. Greater phon levels rise up toward us like a 3-D relief map.



Figure 6.9 

Equal loudness contours. (Fletcher and Munson 1933.)


Here's a practical application for the phon scale. Suppose we record a symphony orchestra
performing with intensities between 60 and 95 dB SIL. If we play it back at a lower intensity, say,
40-75 dB SIL, it sounds tinny, lacking in bass and treble. Reproduced at lower intensity, low and
high frequencies of low intensity receive greater attenuation in our perception because of the ear's
lack of sensitivity at these frequencies. If we compensate by boosting the bass and treble according
to the equal loudness curves, we can restore something like the original balance of intensities.
Some audio amplifiers come equipped with a so-called loudness knob that applies an
approximation of the above curves for different listening levels.

Sound level meters approximate the loudness corresponding to the intensity of sound. They
usually include switchable weighting networks, which are filters applied to the input signal that mimic
the Fletcher-Munson curves, attenuating frequencies where our hearing is less sensitive. In this way
the response of the instrument can be made to provide a rough approximation of the perceived
loudness of a sound. The meters typically come with A, B, and C weighting networks, which are
simplified inverse functions of the 40, 70, and 100 phon curves, respectively (Stevens 1961; ISO 1975).

6.5.3 Threshold of Hearing 


Perhaps the most sal ient equal I oudness curve is the threshold of heari ng. A n approxi mati on to the 
threshold of hearing is given by Terhardt (1979) as 


T q (f) 


■Müfe r- 6 -° 


, 2 +io tí¡y‘- 


(6.5) 


This complicated-looking function is an approximation of the threshold of hearing for a young 
adultwith acute heari ng. When plotted for f i n the range of human hearing, it produces thegraph 
shown in figure 6.10. This curve can be used to determine the máximum allowable energy level 
for noise and distortion that can be added by a recording system before it is noticed as distortion 



Figure 6.10 

Threshold of hearing. 



170 


C hapter 6 


by a sensitive listener. This has great relevance for the design of audio systems in general and is
a crucial metric for perceptual audio coders, such as MPEG audio, including the well-known MP3
audio coding format.
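As a minimal sketch, the threshold curve can be evaluated numerically. The Python below assumes the commonly quoted coefficients of Terhardt's approximation, with f in Hz and the result taken to be in dB SPL; the function name and the sample frequencies are illustrative choices, not part of the text.

```python
import math


def threshold_of_hearing_db(f_hz: float) -> float:
    """Approximate threshold of hearing at f_hz (assumed to be in dB SPL).

    Assumes the commonly quoted coefficients of Terhardt's approximation.
    """
    khz = f_hz / 1000.0
    return (3.64 * khz ** -0.8
            - 6.5 * math.exp(-0.6 * (khz - 3.3) ** 2)
            + 1e-3 * khz ** 4)


if __name__ == "__main__":
    # Print the threshold at a few illustrative frequencies.
    for f in (100, 1000, 3300, 10000):
        print(f"{f:>6} Hz: {threshold_of_hearing_db(f):6.1f} dB")
```

Under these assumptions the curve is highest at the low end of the range, dips below 0 dB near 3.3 kHz where the exponential term is largest, and rises again at high frequencies as the fourth-power term takes over.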

A final note on the phon scale: remember that it is a measure of equal loudness. It can only
answer the question: Is the loudness of two frequencies equal? It does not tell us about loudness
differences. For instance, a doubling of loudness in phons does not necessarily result in a sound’s
being heard as twice as loud. To compare proportional loudness requires the sone scale.

6.5.4 The Sone Scale

We can characterize the ratio of two sinusoids with different intensities at the same frequency with
the sone scale. One sone is defined as the loudness of a 1 kHz tone at 40 dB SIL. This is the
reference loudness of the sone scale. This also means that 1 sone = 40 phons. A sound that is judged
to be twice as loud as the reference has a loudness of 2 sones, a sound that is judged to be half as
loud as the reference has a loudness of 0.5 sone, and so on. For example, the average listener hears
a 1 kHz sinusoid at 50 dB as about twice as loud as a 1 kHz sinusoid at 40 dB. Hence, the 50 dB
1 kHz sinusoid has a loudness of 2 sones.

Loudness in sones $L_s$ can be related to loudness in phons $L_p$ as follows:

$$L_s = 2^{(L_p - 40)/10} \qquad \text{Phon/Sone Conversion (6.6)}$$

For 1000 Hz tones, the Ψ variable $L_s$ relates to the Φ variable sound intensity roughly following
a power law:

$$L_s = k\,p^{0.6} \qquad \text{Sones and Intensity (6.7)}$$

where $p$ is the pressure in pascals, and $k$ depends on frequency. These two equations, based on the
work of Stevens (1956), indicate that loudness doubles for a 10 dB increase in SPL. The calibration
of the sone scale is controversial because of the difficulty subjects have in identifying loudness
ratios with certainty. For instance, Warren (1970) found a doubling of loudness for a 6 dB increase
in SPL. Thus, (6.7) should not be taken too literally.
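The two relations can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes the conversion is 2 raised to the power (phons - 40)/10 and a power-law exponent of 0.6 on pressure, and the function names are hypothetical.

```python
import math


def phons_to_sones(phons: float) -> float:
    """Phon/sone conversion: loudness doubles for every 10 phons above 40."""
    return 2.0 ** ((phons - 40.0) / 10.0)


def sones_to_phons(sones: float) -> float:
    """Inverse of phons_to_sones."""
    return 40.0 + 10.0 * math.log2(sones)


def loudness_ratio_for_db_step(db_step: float, exponent: float = 0.6) -> float:
    """Loudness ratio predicted by the power law for a change of db_step dB SPL.

    A change of db_step dB corresponds to a pressure ratio of 10**(db_step/20);
    the constant k cancels when two loudnesses at the same frequency are divided.
    """
    pressure_ratio = 10.0 ** (db_step / 20.0)
    return pressure_ratio ** exponent


if __name__ == "__main__":
    print(phons_to_sones(40), phons_to_sones(50), phons_to_sones(30))
    print(loudness_ratio_for_db_step(10.0))  # close to 2
```

A 10 dB step gives a pressure ratio of 10^0.5, and raising that to the power 0.6 gives 10^0.3, which is very nearly 2. This is the sense in which the two equations agree that loudness doubles for a 10 dB increase.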

It should be clear by now that the relation between the Ψ and Φ domains is anything but simple. What
is especially remarkable to me is that musicians are able to navigate the complexities of all these
nonlinear, nonorthogonal relations with ease, balancing intonation and loudness to achieve precisely
calibrated sonic effects. More astonishing yet is that naive listeners are effortlessly able to sort it all out.

6.5.5 Pitch Shift with Loudness Change

Another example of the nonorthogonality of Ψ variables is that loudness has an impact on pitch.
If the intensity of a 100 Hz tone is increased from 40 dB to 100 dB SPL, the pitch decreases by
about 10 percent. At 500 Hz, the pitch changes by about 2 percent for the same increase in SPL.
Try taking headphones on and off while listening to music with the volume as loud as is
comfortable. You will probably be able to hear the pitch shift as you put them on and off.





6.6 Frequency Domain Masking

When two sinusoids are presented simultaneously, the fainter sinusoid can be rendered inaudible,
or masked, by the louder sinusoid if the fainter one lies within a certain frequency range of the
louder one. Figure 6.11 shows a 1 kHz 60 dB sinusoid, called the masker, which has the effect of
raising the threshold of hearing in its vicinity. The skirts to either side of the 1000 Hz tone indicate
the just noticeable level required for a test tone to be audible in that range. The dashed line indicates
the threshold of hearing in the absence of the masker. For example, a 1.5 kHz sinusoid at 35 dB
or a 750 Hz sinusoid at 35 dB would be inaudible in this case. Frequencies above the masker
frequency are more strongly masked than those below it (Zwicker and Fastl 1990).

The masker will mask any signal that lies below the masking threshold, not just sinusoids. In 
particular, any artifacts of a recording process, background noise, and so on, will all be masked so 
long as they are below the threshold. 

6.6.1 Temporal Masking 

Masking also occurs when tones are played in succession. This is called temporal masking. There
are three possibilities:

■ Forward masking Even after a sound ends, its effect on the threshold of hearing lingers for a
while. The threshold of a test signal following the masker is impaired for a period of time. Forward
masking can last as long as 100 to 200 ms. The relative loudness of the masker and test signal and
their precise timing affect the audibility of the test signal.



Figure 6.11 

Frequency domain masking. 








■ Simultaneous masking This occurs when the masker and the test signal are presented at the 
same time, and is identical to frequency domain masking. 

■ Backward masking This occurs when a masker influences the audibility of a fainter test signal
that precedes it. While this might seem at first to require prescience on the part of the ear, it can
be explained by realizing that sound perception is actually integrated over a time interval preceding
the moment of recognition. The time interval is generally regarded as being on the order of 200 ms.
Fainter sounds lying within this interval are subject to some degree of masking regardless of their
order of arrival. The amount of masking diminishes the more the test signal precedes the masker.
It is also affected by the relative loudness of the two signals.

Approximate durations of forward and backward masking are suggested by figure 6.12 (Zwicker
and Fastl 1990). The curve indicates the level that a short tone burst must have in order to be just
noticeable in the presence of a relatively long masker of 200 ms duration. The y-axis shows
intensity level expressed in dB above the just noticeable level of the test signal by itself, that is, 0 dB
is the reference intensity of the just noticeable level of the test signal alone. The x-axis indicates
the onset time of the test signal relative to the masker signal, which begins at time 0.

We can see that the effectiveness of backward masking decreases sharply as the test signal
increasingly precedes the masker. Backward masking is generally thought to be effective only up to about 5 ms.

When musicians are supposed to strike a note at the same instant, they rarely manage to do so. But
because of temporal masking, our ears are forgiving, perceiving the onset as simultaneous so long as
the attacks lie within the temporal masking intervals. For instance, if a softer tone attacks up to 5 ms
earlier than a louder one, it is masked by the louder one, so we tend to hear the two tones as simultaneous.

Some modern perceptual coders such as MPEG audio divide the audio stream into packets in order
to transmit it more efficiently. Temporal masking makes it possible for the audio packets to still sound
seamless so long as the temporal masking boundaries are not exceeded. To succeed, encoders like



Figure 6.12

Forward and backward masking. [Horizontal axis: time relative to masker onset (ms).]






MPEG audio must be able to provide temporal resolution under 5 ms to ensure that events that are
supposed to be heard as simultaneous are actually perceived that way (Bosi and Goldberg 2003).

6.7 Beats 

When sinusoids of slightly different pitch $f_1$ and $f_2$ are sounded together, the phase difference
changes through time so that they sometimes reinforce and sometimes cancel at a rate of $\Delta f = f_2 - f_1$
(see volume 2, chapter 2). The amplitude of the sum of the two waves modulates at a rate equal
to the difference between their frequencies. Such slow, periodic fluctuations in amplitude are
called beats. Figure 6.13 shows the beating that results when two sinusoids with a frequency ratio
of 91/100 Hz are added together.
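The amplitude modulation can be made explicit with the identity sin A + sin B = 2 cos((A - B)/2) sin((A + B)/2), which rewrites the sum as a slowly varying envelope multiplying a tone at the mean frequency. The following Python is a minimal sketch of that rewriting; the function names are hypothetical, and the 91 Hz and 100 Hz values are used only as example inputs.

```python
import math


def beat_frequency(f1: float, f2: float) -> float:
    """Rate in Hz at which the summed amplitude swells and fades."""
    return abs(f2 - f1)


def two_tone(f1: float, f2: float, t: float) -> float:
    """Direct sum of two unit-amplitude sinusoids at time t (seconds)."""
    return math.sin(2 * math.pi * f1 * t) + math.sin(2 * math.pi * f2 * t)


def envelope_times_carrier(f1: float, f2: float, t: float) -> float:
    """Same signal written as a slow envelope times a tone at the mean frequency."""
    envelope = 2 * math.cos(math.pi * (f2 - f1) * t)  # magnitude repeats |f2 - f1| times per second
    carrier = math.sin(math.pi * (f1 + f2) * t)       # oscillates at (f1 + f2) / 2
    return envelope * carrier


if __name__ == "__main__":
    f1, f2 = 91.0, 100.0
    print("beat frequency:", beat_frequency(f1, f2), "Hz")
    for t in (0.0, 0.0125, 0.05, 0.3):
        print(t, two_tone(f1, f2, t), envelope_times_carrier(f1, f2, t))
```

The cosine envelope itself has a frequency of half the difference, but its magnitude peaks twice per cycle, which is why the audible beat rate equals the full difference between the two frequencies.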

When the ear hears two pure tones of slightly different frequency, the combination produces a
sensation of audible beats at the difference frequency $\Delta f$. This is heard as a kind of fluttering or
wavering of the amplitude of the combined sound. The musical term for this effect is tremolo.
If $\Delta f$ is greater than about 10 Hz, the tremolo effect disappears and the tone becomes
rough-sounding and unpleasant, that is, dissonant. If $\Delta f$ keeps growing beyond about 20 Hz, the
ear starts hearing two distinct, but still rough, tones. As $\Delta f$ keeps growing, the roughness
eventually goes away somewhere near a major third, and we simply experience two separate tones. This
effect is best described by reference to the theory of critical bands (see section 6.9).

6.7.1 Tonal Fusion

Beats are often used by musicians as an aid in tuning their instruments because these qualitative
changes supply additional information along with the ear's pitch perception. Figure 6.14 provides
a graphical representation of tonal fusion and perception of beat frequencies.



Figure 6.13 









Figure 6.14

Tone fusion and the perception of beat frequencies. (Adapted from Roederer 1993.) [Three regions of the figure are labeled Consonant.]

The beat phenomenon arises from pure tones that are very nearly in unison, called first-order
beats. The ear hears the Φ effect created by the amplitude envelope of the two tones and also
experiences a Ψ effect from neural processing.

Beats may also be heard between pure tones that are very nearly an octave, fifth, or fourth apart.
These are called second-order beats. However, this beating results only from the effects of neural
processing.

When frequencies of two sinusoids are within 15 Hz of each other, the ear tends to hear just one
pitch. The two sinusoids lose their separate perceptual identity, and we hear a single fused pitch.
Carl Stumpf (1848-1936) studied the circumstances under which tones appear to be fused. He
defined tonal fusion (tonverschmelzung) as the effect of hearing two tones not as a sum but as a
whole, or unity (Stumpf 1883/1890). He found tonal fusion to be most pronounced in the consonant
intervals (unison, octave, and fifth) and less pronounced in the increasingly dissonant intervals.

6.7.2 Tonal Fusion and Music Composition

Tonal fusion was evidently of concern to J. S. Bach. He sought to compose pleasing music by using
consonant intervals, and he wanted to project a polyphonic musical style, where multiple
independent musical lines are separately discernible. But the most consonant intervals have a tendency
towards tonal fusion, and tonal fusion destroys the sense of polyphony by making separate voices
appear as one. David Huron (1991) conducted a statistical analysis of Bach's music and concluded
that while Bach preferred consonant intervals, he avoided consonant intervals to the extent that
they promoted tonal fusion so as not to compromise polyphony.





Tonal fusion was used explicitly by composer Maurice Ravel in his composition Bolero. The
repetitive melody of this work is passed around in the key of C to various instruments over the
course of its 17 minutes' duration. When it's the French horn's turn, Ravel adds a piccolo playing
the melody transposed strictly at the twelfth (up an octave and a fifth), in the key of G, matching
the third French horn harmonic. He also adds another piccolo playing the melody transposed
strictly at the seventeenth (up two octaves and a third), in the key of E, matching the fifth French
horn harmonic. The tone color of the piccolos and French horn fuse into a single unique hybrid
timbre (Slonimsky 1948, 187-188).

Ravel may have been inspired by the design of mutation stops in French organs. Mutation stops
are ranks of organ pipes that sound at a pitch other than the unison or octave. When played with
regular stops, they alter, or mutate, the timbre of the regular stop. Nazard is the French name for
a mutation stop that sounds at the twelfth; the tierce sounds at the seventeenth.

Pitch is only one factor that can cause tones to fuse; spectral content and articulation (such as
tremolo, vibrato depth, and rate) are also factors. John Chowning presented a striking example of
tone fusion based on vibrato in his work on the singing voice (see volume 2, chapter 9).

6.8 Combination Tones 

The violinist, theorist, and composer Giuseppe Tartini famously noted in 1754 that when two
loud, pure tones are sounded together, a third is sometimes also heard at the difference frequency,
$\Delta f = f_u - f_l$, where $f_u$ and $f_l$ are the upper and lower frequencies.¹¹ For example, 2100 Hz and
2000 Hz produce a difference tone of 100 Hz. This effect can be demonstrated easily by having
two pennywhistle players or soprano recorder players stand near each other playing very
high-pitched tones very loudly. The players and anyone sufficiently close to them will hear
low-pitched rough tones at the difference frequency. This phenomenon is called difference tones,
sometimes also Tartini’s tones.

Helmholtz claimed to have discovered a tone at $\Delta f_s = f_u + f_l$ that he called sum tones.
Mathematical theory strongly suggests sum tones exist, but they are so hard to hear that it is an open
question as to whether they have ever been perceived experimentally. We know now that the reason
sum tones are hard to hear has to do with masking.

Difference tones and sum tones are generally called combination tones. It is easy, but incorrect,
to suppose that this phenomenon is related to beats. Beats cannot explain sum tones because beats
only arise in the difference of two frequencies. Also, the effect of beats disappears as the two
frequencies diverge beyond a minor third, whereas for combination tones, $\Delta f$ need not be small to
be quite audible. Finally, if the tones are presented one to each ear, beats are still discernible but
combination tones are not.

Helmholtz (1863, app. 12) conjectured that we hear combination tones because of nonlinear
processing of loud signals in the ear. He supposed that the strength of the tones was forcing the
excursion of the tympanum and other elements of the middle ear beyond their region of linear elasticity,





thereby distorting the sound in the ear. Nonlinear systems can respond to vibration by generating
signals not actually present in the stimuli (see volume 2, chapter 9). The square of the sum of two
signals, $(\sin a + \sin b)^2$, which is a quadratic nonlinear expression, includes tones at $a + b$ and $a - b$ (see
volume 2, appendix).
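A small numerical experiment illustrates how a quadratic nonlinearity creates sum and difference tones. The sketch below is an illustration under stated assumptions, not a model of the ear: it squares the sum of a 2100 Hz and a 2000 Hz sinusoid and measures the amplitude at selected frequencies by direct correlation. The sample rate, signal length, and helper name `amplitude_at` are arbitrary choices, picked so that every frequency of interest completes a whole number of cycles.

```python
import cmath
import math

SAMPLE_RATE = 16000   # Hz (assumed; high enough to represent the 4100 Hz sum tone)
NUM_SAMPLES = 1600    # 0.1 s, so multiples of 10 Hz complete whole cycles
F_UPPER, F_LOWER = 2100.0, 2000.0


def amplitude_at(signal, freq_hz, sample_rate=SAMPLE_RATE):
    """Amplitude of the sinusoidal component of `signal` at freq_hz."""
    acc = sum(x * cmath.exp(-2j * math.pi * freq_hz * i / sample_rate)
              for i, x in enumerate(signal))
    return 2.0 * abs(acc) / len(signal)


# Linear case: the plain sum of the two tones.
linear = [math.sin(2 * math.pi * F_UPPER * i / SAMPLE_RATE)
          + math.sin(2 * math.pi * F_LOWER * i / SAMPLE_RATE)
          for i in range(NUM_SAMPLES)]

# Quadratic nonlinearity: square the sum sample by sample.
squared = [x * x for x in linear]

if __name__ == "__main__":
    for f in (F_UPPER - F_LOWER, F_LOWER, F_UPPER, F_UPPER + F_LOWER):
        print(f"{f:7.0f} Hz  linear {amplitude_at(linear, f):.3f}"
              f"  squared {amplitude_at(squared, f):.3f}")
```

The linear sum contains energy only at the two original frequencies, while the squared signal contains components at the difference (100 Hz) and the sum (4100 Hz), along with doubled-frequency terms at 4000 and 4200 Hz and a constant offset.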

Studies by Guinan and Peake (1967) have shown that nonlinear effects in the middle ear cannot
by themselves explain combination tones. Current theory favors an effect within the cochlea for
combination tones, although dynamic feedback paths from the auditory cortex may explain some
other distortion products that have been observed (B. Moore 1997). There is still a high level of
theoretical ambiguity in this subject.

6.9 Critical Bands 

Fletcher (1940) unified many of the phenomena described in the sections on frequency domain
masking, temporal masking, and beats with a concept that he called critical bands. These can be
thought of as channels of frequency-selective psychoacoustic processing that affect our perception
of pitch, loudness, and masking of frequency components lying within a critical frequency distance
of one another. This insight eventually led to the psychoacoustic encoding of sound and the
introduction of the MPEG audio encoding standard. MPEG takes advantage of the effects that critical
bands have on hearing.

6.9.1 Critical Bands and Loudness

Zwicker and Feldtkeller (1955) provided an elegant demonstration of critical bands based on a
loudness effect. They played a narrowband noise signal containing all frequencies between 980 and
1020 Hz. The bandwidth of the signal was 40 Hz, and its band center was 1000 Hz (figure 6.15a).
Then, keeping the band center at 1000 Hz, and keeping the total intensity constant, they gradually


Figure 6.15

Critical bands and loudness. [Panels (a), (b), and (c): constant total intensity with increasing bandwidth around a 1000 Hz band center; horizontal axis: frequency.]






increased the bandwidth, spreading the same energy over a larger and larger frequency range
(figures 6.15b and 6.15c). One might expect that each of these signals would be heard as equally
loud because each contains the same total energy. And, indeed, subjects reported that the loudness
remained constant... but only up to a certain bandwidth, after which perceived loudness began
to increase even though there was no increase in total energy. With the band center at 1000 Hz,
loudness began to increase when the bandwidth exceeded about 160 Hz.

So when the noise bandwidth was kept narrower than a critical threshold (160 Hz bandwidth
at 1000 Hz band center frequency), the researchers got the expected effect: subjects reported
that noise bands of varying bandwidth and constant intensity all sounded equally loud.
But when bandwidth exceeded the critical threshold, subjects reported increasing loudness,
even though total energy in the noise spectrum remained constant. Figure 6.16 shows how
they observed loudness to increase after the bandwidth of the noise grew beyond the width of
a critical band.

To explain this effect, Zwicker and Feldtkeller theorized that the ear lumps together the loudness
of components that lie within the same critical band. Loudness increases when significant energy
spills into more than one critical band. Thus, within the critical band, loudness is a function of the
spectral width and spectral intensity. But once the bandwidth of the noise is broader than a critical
threshold, all that matters is the spectral intensity. In order to understand this effect, it is necessary
to do another experiment with masking.

6.9.2 Critical Bands and Masking

Suppose I play a sinusoid with frequency $f_s$ and a wideband noise source with a band center $f_c$ that
is distant in frequency. You adjust the loudness of the sinusoid so that you can just barely hear it
over the noise. Let's call this your just noticeable loudness threshold, $T_0$ (figure 6.17a). Now,
keeping its spectral amplitude the same, if I move the noise signal’s center frequency so it is the same
as the sinusoid (figure 6.17b), the noise masks the sinusoid, and you can no longer hear it. Now





Figure 6.17

Sinusoid with wideband noise signal. [Panels (a) and (b): sinusoid at threshold $T_0$ and a noise band of width $\Delta f_n$.]

suppose I allow you to raise the level of the sinusoid to some level $T$ so you can hear it above the
noise. The difference between the thresholds, $\Delta T = T - T_0$, is the amount by which the noise signal
masks the sinusoid.

Keeping its amplitude the same, if I now increase the bandwidth $\Delta f_n$ of the noise, the sinusoid
will again become inaudible. You must make the sinusoid even louder (by increasing $T$) before
you can hear it again. Therefore the amount of masking $\Delta T$ increases as the bandwidth of the
masking signal increases.

However, and this is the interesting part, beyond a critical threshold, increases in the noise
bandwidth $\Delta f_n$ no longer increase the amount of masking. No further increases in $T$ are required,
no matter how much broader the bandwidth of the noise signal becomes.

Fletcher, whose experiment this is, suspected that this effect occurred because of a
neurophysiological structure in the ear. He suggested that areas of the basilar membrane responded together
to selected frequency ranges, the critical bands. The bandwidth of the noise signal where it ceased
to further increase the just noticeable loudness threshold of the sinusoid was taken as the width of
a critical band, centered on that frequency.

6.9.3 Critical Bands and Pitch

Plomp and Levelt (1965) and Greenwood (1961a) suggested that a pitch-based relation exists
between consonance and critical bands. They believed that pitches of sinusoids separated by less
than a critical band give rise to the effects described in the section on beats, including tonal fusion,
whereas sinusoids that are separated enough to resolve into two distinct critical band regions on
the basilar membrane give rise to the perception of two distinct tones. Experiments showed that
sinusoids appear most dissonant at approximately 40 percent of a critical band. Sinusoids both
closer and farther than that pitch distance become less dissonant. The dissonant sensation does not
occur in the region of tone fusion and also does not occur where the difference in frequency exceeds
the width of a critical band.





6.9.4 MP3 and Critical Bands

MP3, a component of the MPEG standard (see volume 2, chapter 10), is a practical application of
masking due to critical bands. MP3 is an extension of the technology for encoding digital audio,
known as pulse-code modulation, or PCM (see volume 2, chapter 1). MP3 has created a revolution
in music distribution because audio can be transmitted and stored much more efficiently in MP3
format than with PCM while maintaining satisfactory sound quality.

Both MP3 and PCM encoding rely on a psychoacoustical model (a set of judgments about what
we can and can't hear) to determine what information in the signal should be encoded. PCM
encoding uses a relatively weak psychoacoustical model:

■ Frequencies above 22.5 kHz are not encoded because they are above the range of human
hearing.

■ Sounds louder or softer than certain limits are also not encoded.

The MP3 psychoacoustical model inherits the PCM model. (In fact, the input to an MP3 encoder
is a PCM-encoded audio signal.) But the MP3 psychoacoustical model also includes criteria about
human temporal and spectral masking, and so it can encode more efficiently than PCM.

MP3 encoding takes place in two principal stages:

1. A psychoacoustical model of the critical bands identifies irrelevant frequency components:
those that would not be perceived because of temporal and spectral masking effects. Masked
components are encoded with less detail than is employed for unmasked components, thereby simplifying
the spectrum of the encoded signal. Although simplifying the encoding of the masked components
distorts them, the distortion isn't noticeable because the components are masked. (The amount of
simplification, and hence the amount of distortion, must be constantly monitored and adjusted to be
sure that the distortion introduced by this process never exceeds the masking threshold.)

2. The simplified spectrum is then subjected to additional steps to remove redundant information
in the signal and put it in the most compressed representation possible (see section 9.15).

The result is an encoding of sound that can be transmitted with less effort or stored in less space
than is required for PCM. Reductions by a factor of 12 to 20 relative to PCM audio are
possible without substantial degradation of sound quality.

In order to recreate the audio signal, an MP3 decoder is required. The MP3 decoder restores the
simplified spectrum from the compressed representation encoded by step 2 above. However, the
decoder cannot reverse the simplified encoding of the masked components in step 1 because that
information was discarded. Because MP3 can't recover exactly the signal presented to its encoder,
it is a lossy encoding or lossy compression scheme.

Strictly speaking, only the simplification of masked components in step 1 is lossy. The
redundant information removed in step 2 is recovered completely in the decoding stage, so step 2
performs lossless encoding or lossless compression. Technically, PCM audio encoding is also






Figure 6.18

Critical bandwidth vs. center frequency. [Horizontal axis: center frequency f, 2000 to 12000 Hz.]

lossy: because of the frequency and amplitude limits it imposes, it doesn't recover exactly the
signal presented to its encoder. However, because there is no simplification of masked components,
PCM is less lossy than MP3.

6.9.5 Measuring Critical Bands

Although critical bandwidth estimates vary substantially depending upon the type of experiment
used to measure them, they average about one third of an octave for most of the audible range
(but are greater at low frequencies). A semitone interval is $\sqrt[12]{2}$, so one third of an octave would
be $\sqrt[3]{2} \approx 1.26 \approx 5/4$, or slightly over a major third. Thus, the ear appears to take into account the
stimulus of neurons as far away as one third of an octave in order to determine the loudness of
a sound.
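The interval arithmetic can be checked directly. In this short sketch the `cents` helper is an illustrative name, and 5/4 is taken to be the just major third.

```python
import math

SEMITONE = 2 ** (1 / 12)       # equal-tempered semitone ratio
THIRD_OCTAVE = 2 ** (1 / 3)    # one third of an octave (four semitones)
JUST_MAJOR_THIRD = 5 / 4


def cents(ratio: float) -> float:
    """Size of a frequency ratio in cents (1200 cents per octave)."""
    return 1200 * math.log2(ratio)


if __name__ == "__main__":
    print(f"one third of an octave: ratio {THIRD_OCTAVE:.4f}, {cents(THIRD_OCTAVE):.1f} cents")
    print(f"just major third:       ratio {JUST_MAJOR_THIRD:.4f}, {cents(JUST_MAJOR_THIRD):.1f} cents")
```

One third of an octave is exactly four equal-tempered semitones (400 cents), which is about 14 cents wider than the 5/4 major third.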

Figure 6.18 shows how critical bandwidth varies with frequency for a typical listener (Zwicker,
Flottorp, and Stevens 1957). The dashed line indicates one third of an octave. For instance, at
10 kHz, one third of an octave is between 2 and 2.5 kHz.

Critical bandwidth remains fairly constant up to about 500 Hz, then grows by about 20 percent
of frequency thereafter. A reasonable approximation of the critical bandwidth is given by Zwicker
and Fastl (1990) as

$$BW_c(f) = 25 + 75\left[1 + 1.4\left(\frac{f}{1000}\right)^{2}\right]^{0.69} \text{ Hz} \qquad \text{Critical Bandwidth (6.8)}$$

where $BW_c(f)$ is the critical bandwidth at frequency $f$.
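As a quick numerical check, the approximation can be coded in a few lines. The sketch assumes the commonly quoted form of the formula (25 Hz plus 75 Hz times a bracketed term raised to the power 0.69, with f in Hz); the function name is illustrative.

```python
def critical_bandwidth_hz(f_hz: float) -> float:
    """Approximate critical bandwidth in Hz at center frequency f_hz (in Hz).

    Assumes the commonly quoted form of the approximation.
    """
    khz = f_hz / 1000.0
    return 25.0 + 75.0 * (1.0 + 1.4 * khz ** 2) ** 0.69


if __name__ == "__main__":
    for f in (100, 500, 1000, 4000, 10000):
        print(f"{f:>6} Hz -> {critical_bandwidth_hz(f):7.1f} Hz")
```

At 1000 Hz this gives roughly 162 Hz, in line with the 160 Hz threshold reported for the loudness experiment, and at low frequencies it flattens out near 100 Hz.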

Although the critical bands are continuous, it is sometimes useful to think of the ear as
comprising a discrete set of bandpass filters that obey (6.8). Using this approach, it is common to
divide the ear's spectrum into 24 discrete critical bands, as shown in table 6.1 (Zwicker 1961). This





Table 6.1

Bark Scale: Critical Bandwidth, Center Frequency, and Critical Band Rate

Bark No.   Center Frequency (Hz)   Critical Bandwidth (Hz)   Lower Band Edge (Hz)
    0               50                      80                       20
    1              150                     100                      100
    2              250                     100                      200
    3              350                     100                      300
    4              450                     110                      400
    5              570                     120                      510
    6              700                     140                      630
    7              840                     150                      770
    8            1,000                     160                      920
    9            1,170                     190                    1,080
   10            1,370                     210                    1,270
   11            1,600                     240                    1,480
   12            1,850                     280                    1,720
   13            2,150                     320                    2,000
   14            2,500                     380                    2,320
   15            2,900                     450                    2,700
   16            3,400                     550                    3,150
   17            4,000                     700                    3,700
   18            4,800                     900                    4,400
   19            5,800                   1,100                    5,300
   20            7,000                   1,300                    6,400
   21            8,500                   1,800                    7,700
   22           10,500                   2,500                    9,500
   23           13,500                   3,500                   12,000
   24           19,500                                           15,500







Figure 6.19 

Quality factor of critical bands. 

numbered list of discrete critical bands is the bark scale. The bark scale encodes the center
frequency and bandwidth of each numbered critical band.

6.9.6 Quality Factor of Critical Bands

Table 6.1 shows that the bandwidth of the critical bands increases in relatively constant proportion
as the center frequency of the band increases. The ratio of the center frequency to the bandwidth
of a bandpass filter is its quality factor, often abbreviated Q. Figure 6.19 shows the Q for each of






Figure 6.20 

Frequency to bark function. 


the critical bands. The narrower the bandwidth, the higher the Q, and the stronger will be its
resonance when stimulated by a signal whose frequency lies within the band (see section 8.9.6). Like
the critical bands, a constant-Q filter varies its bandwidth as a function of the center frequency,
keeping a constant ratio between them. Note in figure 6.19 that the Q of most bands is fairly
constant in the range from 4 to 6, especially in the center range of hearing. Thus, the critical bands
can be viewed as similar to constant-Q bandpass filters.
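The quality factor is a simple ratio, so it can be computed from any row of table 6.1. The sketch below uses a handful of rows copied from the table; the selection of rows and the names are illustrative.

```python
# (bark number, center frequency in Hz, critical bandwidth in Hz), copied from table 6.1
SAMPLE_ROWS = [
    (1, 150, 100),
    (8, 1000, 160),
    (13, 2150, 320),
    (17, 4000, 700),
    (21, 8500, 1800),
]


def quality_factor(center_hz: float, bandwidth_hz: float) -> float:
    """Q of a bandpass filter: center frequency divided by bandwidth."""
    return center_hz / bandwidth_hz


if __name__ == "__main__":
    for bark, center, bandwidth in SAMPLE_ROWS:
        print(f"bark {bark:2d}: Q = {quality_factor(center, bandwidth):.2f}")
```

The lowest bands, whose bandwidth stays near 100 Hz, have a much lower Q than the bands in the middle of the hearing range, where bandwidth grows roughly in proportion to center frequency.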

The center frequencies and bandwidths in table 6.1 are only samples of the continuous
frequency response of the ear. In reality, the auditory effects of critical bands are formed around
the frequencies of the signals the ear hears and are not associated with a specific fixed filter bank
in the ear.

The bark number for a frequency in Hz can be obtained with the following equation (Zwicker and
Fastl 1990):

$$z(f) = 13\arctan(0.00076 f) + 3.5\arctan\left[\left(\frac{f}{7500}\right)^{2}\right] \qquad \text{Bark Number (6.9)}$$

Figure 6.20 shows (6.9) plotted for the range of 20 Hz to 20 kHz. 
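A minimal sketch of the frequency-to-bark mapping follows. It assumes the commonly quoted form of the equation, in which the second arctangent takes the square of f/7500; the function name is illustrative.

```python
import math


def bark_number(f_hz: float) -> float:
    """Approximate critical band rate (bark) for a frequency in Hz.

    Assumes the commonly quoted form:
    13*atan(0.00076 f) + 3.5*atan((f/7500)^2).
    """
    return 13.0 * math.atan(0.00076 * f_hz) + 3.5 * math.atan((f_hz / 7500.0) ** 2)


if __name__ == "__main__":
    for f in (20, 100, 1000, 4000, 20000):
        print(f"{f:>6} Hz -> {bark_number(f):5.2f} bark")
```

Over the range of 20 Hz to 20 kHz the function rises monotonically from near 0 to a little under 25 bark, consistent with dividing the ear's spectrum into about 24 critical bands.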

6.9.7 Critical Bandwidth and Pitch JND

The curve of critical bandwidth vs. center frequency in figure 6.18 is very close to the same shape
over the same range as the curve for pitch JND in figure 6.4. While the pitch JND spreads from about
3 to 30 Hz in 5 kHz, critical bandwidth goes from about 100 to 900 Hz over the same range. Thus,
critical bands are proportionally about 30 times wider than the pitch JND at the same frequency.

6.10 Duration

How long does it take for us to identify the pitch of a tone? How long does it take to determine the
loudness of a sound?





6.10.1 Effect of Duration on Pitch

Tones with quick onset times (such as vibraphone or marimba) have a clicking or percussive attack
that is essentially a broadband noise. Very brief tones (under 10 ms) sound like clicks no matter
what timbre the tone has. As the tone lengthens beyond about 15 ms, and if the tone's onset
becomes more gradual, pitch perception solidifies up to about 30 ms. Pitch perception becomes
stronger as the tone continues growing in length regardless of onset time.

Tones of greater complexity do not necessarily take longer to recognize. The ear can identify
many pitches simultaneously in nearly constant rate time. While it can be shown that pitch depends
on duration, this dependence is for extremely short tones only. This allows us, for instance, to
follow extremely rapid polyphonic musical passages with sufficient accuracy to enjoy the experience.
Music as we know it would be radically different if pitch were substantially dependent on duration.

Under optimal conditions we establish a sensation of pitch about 4-8 cycles after tone onset. The
conventional wisdom is that the attack noise masks the underlying periodic vibration of the instrument
and that this masking delays our pitch recognition. However, I have a different experience to report.

I once developed a pitch-tracking computer system for an electronic violin built by Max
Mathews. While developing the system, I spent many hours listening to, and analyzing, violin
tones. I observed that I was generally able to identify the correct pitch well before any significant
periodic information was available in the signal. This suggested that my ear was using the
characteristic broadband noise in the violin's attack transient to help identify the pitch. Spectral
and temporal qualities of the attack noise may provide additional early clues to the correct pitch,
perhaps through a cognitive learned response.

6.10.2 Critical Bands and Acoustical Uncertainty

Generally, the more precise we wish to be about the exact frequency of a sound we are hearing,
the longer we must listen to it. Let’s say that the smallest frequency difference we can discriminate
is $\Delta f = f - f_0$ and that the required duration over which we must listen in order to resolve this
frequency difference is $\Delta t = t - t_0$. Then we can say that the acoustical uncertainty is

$$\Delta f\,\Delta t \geq k \qquad \text{Acoustical Uncertainty (6.10)}$$

where $k$ is a constant that relates achievable frequency resolution to required temporal resolution,
and vice versa.

Under optimum conditions $k \approx 0.1$ for the auditory system (Majernick and Kaluzny 1979). This
means that to achieve frequency resolution of 0.1 Hz, our ears require about 1 second of the
stimulus under optimum conditions. If all we have is 0.1 second of stimulus, our ears can achieve
frequency resolution of about 1 Hz under optimum conditions. To achieve finer frequency resolution
(smaller $\Delta f$) requires a correspondingly larger time interval (larger $\Delta t$). Thus, $k$ represents the
fundamental limit on our ability to know the precise frequency of a signal within a precise time interval.
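The trade-off can be expressed as two one-line calculations. This sketch takes the product of frequency and time resolution at its limiting value k, using the k of about 0.1 quoted for optimum conditions; the function names are hypothetical.

```python
K_OPTIMUM = 0.1  # value quoted for the auditory system under optimum conditions


def min_duration_s(delta_f_hz: float, k: float = K_OPTIMUM) -> float:
    """Shortest listening time (s) needed to resolve a difference of delta_f_hz."""
    if delta_f_hz <= 0:
        raise ValueError("delta_f_hz must be positive")
    return k / delta_f_hz


def best_resolution_hz(duration_s: float, k: float = K_OPTIMUM) -> float:
    """Finest frequency resolution (Hz) achievable from duration_s of stimulus."""
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    return k / duration_s


if __name__ == "__main__":
    print(min_duration_s(0.1))      # about 1 second to resolve 0.1 Hz
    print(best_resolution_hz(0.1))  # about 1 Hz from 0.1 second of stimulus
```

Halving the available listening time doubles the smallest resolvable frequency difference, and vice versa.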

Uncertainty plays many important roles in the mathematics of music (see section 9.15,
equation (9.19); and volume 2, chapters 3 and 10).





Clearly, we want our ears to have the finest frequency precision possible (small Δf) with the
shortest response time possible (small Δt). But the basilar membrane is governed by (6.10). So
increasing its frequency precision would necessarily require us to lessen its temporal precision.
However, our hearing neatly sidesteps the limitations of (6.10) by using critical bands. The ear
divides up the audio spectrum into frequency bands, each of which has a relatively broad Δf. Hence,
within each band, Δt can be relatively small. That way we get good frequency resolution without
suffering poor temporal resolution.

The trade-off is that the critical bands provide relatively poor pitch discrimination by
themselves. That the pitch JND is about 30 times finer than the width of a critical band suggests just
how greatly aided we are by critical bands and the dynamic feedback processes, described in the
section on frequency sharpening, that refine our sense of pitch within a critical band.

6.10.3 Loudness and Duration

The acoustical uncertainty principle applies for loudness as well. The ear averages over a duration
of about 200 ms to determine the loudness of a sound. Because of this fact, sounds that are shorter
than 200 ms must be proportionately more intense to appear to have the same loudness as sounds
that are longer than this threshold. Put another way, loudness is proportional to duration up to about
200 ms. More precisely, loudness grows by 10 dB as duration grows by a factor of 10, up to
200 ms. This correlation is even stronger for broadband sounds and extends up to about 1 second.
An important consequence of these facts is that our ears lack a means to protect themselves
naturally against impulsive high-intensity sounds, such as gunfire (see the section on acoustic reflex
and temporary threshold shift).
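
The 10 dB per decade rule can be turned into a small calculation. The sketch below is only an illustration under the assumption that the rule applies exactly below a 200 ms integration time and that loudness stops growing above it; the function name and the hard cutoff are my own simplifications.

```python
import math

INTEGRATION_TIME_S = 0.2  # about 200 ms, as stated in the text


def loudness_offset_db(duration_s, integration_s=INTEGRATION_TIME_S):
    """Loudness of a short sound relative to a long one of equal intensity (dB).

    Below the integration time the offset follows 10 dB per factor of 10 in
    duration; at or above it the offset is taken to be zero.
    """
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    if duration_s >= integration_s:
        return 0.0
    return 10.0 * math.log10(duration_s / integration_s)


if __name__ == "__main__":
    for ms in (2, 20, 200, 2000):
        print(ms, "ms:", round(loudness_offset_db(ms / 1000.0), 1), "dB")
```

Under these assumptions a 20 ms burst would need roughly 10 dB more intensity than a 200 ms tone to sound equally loud.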

6.11 Consonance and Dissonance

Consonance was defined in chapter 3 as tones that sound well together. But what process governs
our perception of consonance? There are many theories of consonance. They generally fall into one
or more of the following categories:

■ Cultural theories examine social, cultural, and stylistic norms. 

■ Acoustic theories look at the physical properties of acoustical signals, such as properties of musical
instruments and scale systems.

■ Psychophysical theories look at how the neurophysical structure of the ear may affect 
consonance. 

■ Cognitive theories examine learning, expectation, and categorical perception.

As an example of a cognitive theory, a dissonant sound may be heard as consonant if it
is preceded by many sounds that are even more dissonant. While learning how to write
eighteenth-century chorale harmonizations in the style of J. S. Bach, I experienced a shift in my
expectation of consonance as my ear acclimated to this antique style. I came to appreciate why
his contemporaries found some of Bach's chorale settings shocking, whereas to a modern listener
they can seem bland.

Theories of consonance stretch back at least to Galileo Galilei. Plomp and Levelt (1965) quote
Galileo (1638) as follows: "Agreeable consonances are pairs of tones which strike the ear with a
certain regularity; this regularity consists in the fact that the pulses delivered by the two tones, in
the same interval of time, shall be commensurable in number, so as not to keep the ear drum in
perpetual torment." The relatively large number of extant theories of consonance goes far beyond
what can be summarized here. Instead, I develop a simple psychophysical model based on critical
bands to give an idea of the subject.

If two sine tones of 1000 Hz and 1015 Hz are played, we do not hear two distinct tones. Instead,
we tend to hear one fused pitch with a 15 Hz beat frequency and attendant roughness (see
section 6.7). This is because the critical bandwidth for 1000 Hz is about 160 Hz, and the two tones
lie close within the same critical band. If we let "dissonance" define this roughness, then
"consonance" defines its absence. In terms of figure 6.14,

■ Frequencies differing by less than a JND of pitch form a perfect consonance, or unison.

■ Frequencies differing by more than a critical band form a consonance.

■ Frequencies differing by between 5 percent and 50 percent of a critical band are the most 
dissonant. 

Greenwood (1961b) was the first to observe a relation between critical bandwidth and
judgments of consonance and dissonance. He analyzed data collected by Mayer in 1894 and compared
it to the estimated size of critical bands. Sounding intervals with tuning forks, Mayer had asked
listeners to identify the smallest interval for which no dissonance was perceived. Greenwood's plot
of Mayer's data suggested that the dissonance disappears when the distance between pure tones
is greater than or equal to the size of a critical band. Linking dissonance to position within critical
bands is called tonotopic dissonance.

We can expand this observation into a simple psychophysical metric for the consonance or
dissonance of any two complex tones by counting how many of their partials land together in
critical bands (and discounting any that lie within a JND or do not share a common critical
band). The hypothesis is that the more the partials of the two tones fall within the 5 percent
to 50 percent critical band range, the more dissonant the two tones should be (Plomp and
Levelt 1965).

The following is an illustration of this approach.

1. Start with two complex tones that form a musical interval, say, a perfect fourth.

2. Count the number of dissonant partials d. For each partial of the lower tone p_l, count how many
partials of the upper tone p_u form an interval that is within a critical band of p_l. Use equation (6.8)
to compute the critical bandwidth for each harmonic. If the interval is small enough to fall within
the same pitch JND, exclude it from the count of dissonances because it is perceived as a unison
and hence is consonant.





3. As we go up in frequency, at some point the successive harmonics of each tone will begin to fall
within the same critical band because the critical bands widen with increasing frequency but the
spacing between harmonics does not. Stop counting when the harmonics of either tone by themselves
begin to fall within one critical band, because if the ear uses a method like this, we presume it
would have to stop here as well.

Figure 6.21 shows the count of dissonances for p_l = 220 Hz with p_u set in turn to the 12
equal-tempered semitone intervals from unison to one octave above p_l, based on the critical band
function given in equation (6.8).
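
The counting procedure in steps 1 to 3 can be sketched in code. This is a rough illustration, not the calculation behind figure 6.21. Equation (6.8) is not reproduced in this section, so the sketch substitutes the widely used Zwicker and Terhardt critical-bandwidth approximation, which gives about 160 Hz at 1000 Hz. It also assumes a pitch JND of one-thirtieth of a critical band, and reads "within a critical band" as a frequency gap smaller than the bandwidth at the lower partial. All function names are hypothetical, and the counts it prints need not match table 6.2.

```python
def critical_bandwidth(f_hz):
    """Assumed stand-in for equation (6.8): Zwicker-Terhardt approximation (Hz)."""
    return 25.0 + 75.0 * (1.0 + 1.4 * (f_hz / 1000.0) ** 2) ** 0.69


def usable_harmonics(f0_hz, max_harmonics=64):
    """Harmonics of f0 below the point where adjacent harmonics share a band (step 3)."""
    harmonics = []
    for n in range(1, max_harmonics + 1):
        f = n * f0_hz
        # Adjacent harmonics are f0 apart; stop once a critical band spans that gap.
        if critical_bandwidth(f) >= f0_hz:
            break
        harmonics.append(f)
    return harmonics


def dissonance_count(f_low_hz, f_high_hz, jnd_fraction=1.0 / 30.0):
    """Count partial pairs closer than a critical band but farther than a JND (step 2)."""
    count = 0
    for p_l in usable_harmonics(f_low_hz):
        band = critical_bandwidth(p_l)
        for p_u in usable_harmonics(f_high_hz):
            gap = abs(p_u - p_l)
            if gap < jnd_fraction * band:
                continue  # heard as a unison, hence consonant
            if gap < band:
                count += 1
    return count


if __name__ == "__main__":
    p_l = 220.0
    for semitones in range(13):
        p_u = p_l * 2.0 ** (semitones / 12.0)
        print(semitones, dissonance_count(p_l, p_u))
```

Changing the JND fraction or the bandwidth formula shifts the counts, which is one reason orderings produced this way are arguable.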

If we list the intervals from figure 6.21 in order of increasing dissonance, the results are as shown
in table 6.2. The order of the shaded intervals agrees with the ones in table 3.5, which is ordered
according to standard Western cultural norms of musical consonance. So this approach is fine up to
a point. The rest of the orderings in table 6.2 are arguable. The minor seventh and major second are
predicted to be more consonant than the major third, for example. Also, it does not seem right that
the tritone should have the same consonance as the major third. This result agrees with Terhardt
(1974), who wrote that consonance is "only slightly and indirectly correlated with musical intervals.
Thus, psychoacoustic consonance cannot be considered as the basis of the sense of musical intervals."

While better computational estimates of consonance are available (e.g., Kameoka and
Kuriyagawa 1969a; 1969b), the sheer number of competing theories of consonance extant in the world
today suggests that the only way forward is to perform (sigh) more research.



Figure 6.21 

Dissonance metric of equal-tempered intervals based on critical bands. 


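Table 6.2

Intervals in Increasing Order of Dissonance

Interval          Pitch class   Dissonance
Unison                 0             0
Octave                12             0
Perfect fifth          7             4
Perfect fourth         5             6
Major sixth            9             6
Minor seventh         10             6
Minor sixth            8             7
Major second           2             7
Major third            4             8
Tritone                6             8
Major seventh         11             8
Minor third            3            11
Minor second           1            15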






6.12 Localization 

How is it that we can so easily tell which direction a sound is coming from? In placing a sound in
space, we extract psychophysical cues from arriving sounds based on

■ The geometry of the outer ear and the placement, size, and shape of the pinnae and ear canal 

■ The geometry and orientation of the head, chest, and shoulders 

■ The distance of the ears above the ground 

We add to these psychophysical cues a cognitive framework that includes

■ Understandings about the acoustical properties of the sound source

■ Basic acoustical facts about sound transmission in air

■ Information about the known acoustical environment

■ Information about our orientation in space in six degrees (up/down, left/right, forward/backward,
pitch, yaw, and roll)

Incredibly, for each sound source in the environment, our hearing automatically and
instantaneously creates a psychological image of the sound with its direction and distance encoded so that
we register it subjectively as an object in space/time, together with the nature of the acoustical
environment that it lies within. Pretty amazing. But that's not all. We can also tell

■ Whether the sound is coming from above or below

■ Its rate of relative motion

■ Its rate of relative acceleration

and much, much more.

6.12.1 Angular Cues 

Angular cues tell the direction of a sound on the horizontal plane. John Strutt (1907), the third
Lord Rayleigh, a pioneer in spatial hearing research, theorized about the cues the ear uses to
determine the angle of an incident sound. He began by noting that if a sound source is located to one
side of the receiver, the sound energy received at the closer (ipsilateral) ear will be more intense
than at the further (contralateral) ear because sound must travel a longer distance to reach the
contralateral ear, and intensity decreases with the square of distance.

6.12.2 Interaural Level Difference 

He also noted that sound traveling to the contralateral ear must navigate around the head. He knew
that high frequencies are attenuated relatively more than low frequencies when they diffract
around an object (see section 7.11). The sound heard at the ipsilateral ear will be brighter than at
the contralateral ear because the head shadows the contralateral ear.





Rayleigh reasoned that by comparing the difference in intensity level, especially of high-frequency
sounds received by the ears, our hearing should be able to tell the direction of the sound. Rayleigh
grouped the intensity cue and the diffraction cue together and called them jointly the interaural level
difference (ILD).

Interaural level difference is small for wavelengths greater than about four times the diameter of
the human head (averaging approximately 17 cm). So this cue shouldn't work for frequencies
below about 500 Hz. But diffraction by the head increases rapidly with increasing frequency, and
above about 3000 Hz, Rayleigh figured that head shadowing should cause a 20-30 dB drop in level
at the contralateral ear, making this a very effective angular cue in this frequency range.
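
The 500 Hz figure follows from setting the wavelength equal to four head diameters. As a worked check, assuming a head diameter of 0.17 m and a speed of sound of 340 m/s (the value used later in this chapter):

```latex
\lambda = 4d = 4 \times 0.17\ \mathrm{m} = 0.68\ \mathrm{m},
\qquad
f = \frac{c}{\lambda} = \frac{340\ \mathrm{m/s}}{0.68\ \mathrm{m}} = 500\ \mathrm{Hz}.
```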

Rayleigh realized that his theory implied that directional sensitivity should vanish for sounds
that contain no energy above about 500 Hz. But when he experimented on this, he was surprised
to discover that he could determine the direction of pure tones even as low as 128 Hz. So he went
back to the old drawing board.

6.12.3 Interaural Time Difference

Rayleigh then considered the possibility that hearing is sensitive to the difference in phase between
signals arriving at each ear because of the greater time that must elapse for the signal to arrive at
the contralateral ear. Rayleigh called this the interaural time difference (ITD) (Strutt 1907). He
used a simplified trigonometric approximation of head shape to calculate the different lengths of
the sound paths to the two ears. His first simplification was to model the head as a sphere with
radius r. Next, he considered only plane waves, whose rays arrive at the ears in parallel. He
characterized the direction of sound arrival, the angle of azimuth, as follows. He drew a radius from
the center of the head forward through the nose as the zero-degree reference (figure 6.22a). He
drew another radius at the angle of the incident plane wave. The angle between these two radii is the
azimuth angle z. With this simplified model the difference in the length of the straight-line path to the
two ears in terms of azimuth is 2r sin z. For example, if z = 0°, sin 0° = 0 and there is no delay
difference for sounds arriving directly from the front (or back). If, however, z = 90°, then sin 90° = 1,
and the sound (which is now coming at the head directly from the left side) must travel an extra
distance of approximately the head diameter to reach the other ear. If we assume sound travels at speed c,
this corresponds to an ITD of 2r sin(z)/c.

Figure 6.22

ITD, spherical head.

There are obvious difficulties with this analysis, including the fact that it calculates the sound
as traveling through the head. Figure 6.22a shows an incident sound ray traveling through the head
to arrive at the contralateral ear. Still, for small angles of azimuth, this is not a bad approximation.
We can improve it slightly as in figure 6.22b by calculating the delay to the far ear along a ray that
arrives at a tangent point on the side of the head (line length d_1), then arcs around by diffraction to
the ear (arc length d_2). With equation (5.2) for arc length, the path length will be d_1 + d_2 = r sin z + rz.
Converting path length to ITD, we have

$$\mathrm{ITD} = \frac{r \sin z + r z}{c}, \qquad |z| < 90^\circ. \qquad \text{Interaural Time Difference (ITD)} \quad (6.11)$$

Equation (6.11) only works for azimuth angles whose magnitude is less than 90° because
beyond 90° the rays arriving from behind the head would be closer to the source, and as the angle
approached 180°, the arriving sound rays would be time-aligned again from behind the head.
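
The two spherical-head models can be compared numerically. The sketch below is a minimal illustration: it assumes a head radius of 0.0875 m and a speed of sound of 340 m/s, takes the azimuth in degrees and converts it to radians for the arc term, and uses function names of my own choosing.

```python
import math

SPEED_OF_SOUND = 340.0  # m/s, assumed
HEAD_RADIUS = 0.0875    # m, assumed (half of a 0.175 m head diameter)


def itd_straight_line(azimuth_deg, r=HEAD_RADIUS, c=SPEED_OF_SOUND):
    """ITD in seconds for the through-the-head path difference 2 r sin z."""
    z = math.radians(azimuth_deg)
    return 2.0 * r * math.sin(z) / c


def itd_diffracted(azimuth_deg, r=HEAD_RADIUS, c=SPEED_OF_SOUND):
    """ITD in seconds from equation (6.11): (r sin z + r z) / c, z in radians."""
    if abs(azimuth_deg) > 90.0:
        raise ValueError("equation (6.11) is only meant for azimuths within 90 degrees")
    z = math.radians(azimuth_deg)
    return (r * math.sin(z) + r * z) / c


if __name__ == "__main__":
    for angle in (0, 15, 30, 60, 90):
        print(angle,
              round(itd_straight_line(angle) * 1e6, 1),
              round(itd_diffracted(angle) * 1e6, 1))  # microseconds
```

For small angles the two models nearly agree, since sin z is close to z; they diverge as the source moves toward the side, where the arc around the head is longer than the chord through it.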

While (6.11) is an improvement, it is still not a good estimate for the large class of people whose
heads are not precisely spherical. Also, most people's ears are not on a diameter through the center
of the head but are somewhat back from it.

Hearing is also sensitive to the onset of sounds, and we also use the onset time difference
between the ears, the lateral onset cue, to help establish direction of arrival.

6.12.4 Problems with ITD 

The ear is only sensitive to ITD for frequencies whose wavelength is greater than twice the distance
between the ears because above the corresponding frequency the effect becomes ambiguous. To see why, let's
assume the diameter of the average head to be d = 0.175 m. When exactly half a wavelength
spans the distance between the ears (λ = 2d), the ear registers the same pressure differences at
both ears regardless of which direction the sound is coming from. If our ears were sensitive to
ITD for waves in the range d < λ < 2d, we would hear an apparent source location on the opposite
side from the true direction of arrival. Perhaps anticipating the potential for adaptive catastrophe
here, our evolutionary intelligence wisely bred this capability out of us. Taking the speed of
sound at standard temperature and pressure to be 340 m/s, the frequency corresponding to λ = 2d
is (340/0.175)/2 = 971 Hz, which, not surprisingly, is about where our ears stop paying attention
to ITD cues. Our hearing is most sensitive to ITD around 500 Hz. At that frequency, experiments
show we have a JND of azimuth Δz near the forward direction of between 1° and 2°. Using (6.11)
to calculate the ITD for Δz = 2° yields the astonishingly small figure of 18 microseconds. Given
the comparatively sluggish synaptic delay time of about 1 millisecond for average neurons, it
seems incredible that our ears are capable of measuring such small delays with such precision,
but they do.
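
Both figures can be checked by hand. The working below assumes r = d/2 = 0.0875 m and expresses the 2° azimuth in radians for the arc term of (6.11):

```latex
f = \frac{c}{\lambda} = \frac{c}{2d} = \frac{340}{2 \times 0.175} \approx 971\ \mathrm{Hz}

z = 2^\circ \approx 0.0349\ \mathrm{rad}, \qquad \sin z \approx 0.0349

\mathrm{ITD} = \frac{r(\sin z + z)}{c}
             \approx \frac{0.0875 \times (0.0349 + 0.0349)}{340}
             \approx 1.8 \times 10^{-5}\ \mathrm{s} = 18\ \mu\mathrm{s}
```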





6.12.5 Duplex Theory

ILD and ITD together are known as the duplex theory because these two effects are
complementary. Our hearing is responsive to ILD cues from 500 Hz upward, becoming reliably strong above
about 3000 Hz, so ILD cues are best for high frequencies. On the other hand, ITD cues are strongest
below about 1000 Hz. For frequencies around 2 kHz, where neither cue works well, our
localization is not very good (Stevens and Newman 1936).

6.12.6 Anatomical Transfer Function

A major difficulty of the duplex theory of ILD and ITD cues is that it implies there should be regions
where we experience front/back and top/bottom source direction ambiguity. The most obvious case
where the theory predicts this should occur is for sounds on the median plane (see figure 6.24). Sounds
arriving from any position on the median plane have ITD and ILD of 0 because they are equally
centered between the ears. Therefore, we should have no clue as to the elevation of a sound. But we can
clearly distinguish sounds from above and below. Furthermore, identical ILD and ITD cues are
supposed to be produced by sounds at positions a, b, c, and d in figure 6.23. The duplex theory implies that
the region of ambiguity forms the surface of a cone of confusion whose apex is the ear. There should
be as many cones of confusion as there are ITD/ILD cues. But the ear, being ignorant of this scientific
difficulty, is quite able to distinguish these cues. So something is missing from the duplex theory.

Researchers eventually noticed the large flaps of skin (pinnae) that stick out from the sides of our
heads. They discovered that our hearing is very sensitive to the way the spectrum of arriving sounds
is modified by the sound shadowing and scattering effects of the pinnae as well as the head, shoulders,
and torso. All these parts of the body cause sounds coming from different directions to be filtered
differently on their way to the eardrum in a manner that is highly predictable by our hearing (after all,
the shape of our bodies is well known to us). These direction-dependent cues, variously called the
anatomical transfer function (ATF) or the head-related transfer function (HRTF), are the essential
cues for discriminating front/back and elevation of sound sources and also play a role in
discriminating lateral cues. Think about it. Why are our pinnae always behind our ear canals? So that we can
tell sounds in front from sounds in back. Sounds arriving from behind the head are subject to more
diffraction than sounds coming from in front because the pinnae block the direct path for signals
behind the head. Similarly, we can tell up from down by diffraction effects caused by our anatomy.
Thus, spectral modifications caused by ATF are important cues to locate sounds in space.

Figure 6.23

Cone of confusion.

6.13 Externalization 

ATF also solves another problem. If I play you a stereo signal through loudspeakers, you hear the
sound coming from the general direction of the speakers, that is, outside your head. But if I play
you the same sound over headphones, you generally experience the sound inside your head.
What's the difference between these two presentation modes? Your ATF. Headphones bypass the
filtering applied by your head, pinnae, shoulders, and torso to incoming signals, depriving you of
a sense of direction for arriving sounds, and your hearing seems to conclude that the sound
therefore must be coming from inside your head. If I simulated your ATF cues by filtering the sounds
I send to your headphones based on measurements of your ATF, you would hear the sounds outside
your head again.

6.13.1 Measuring ATF 

I could determine your own particular ATF experimentally as follows. First, I ask you to sit in a
stable position and (very carefully) insert tiny microphones into your ear canals. While you sit still,
I beam a click or short noise burst from a small loudspeaker at your head from a constant distance
and from all angles of azimuth z and elevation φ around you. I record the signals received by
the microphones in your ear canals, which show how the shape of the waveform is changed by the
scattering/shadowing properties of your body on its way to your ears. I then apply well-known
signal-processing techniques to the recordings (see volume 2, chapter 3), resulting in a set of spectra
describing your ATF for all measured angles of azimuth and elevation.

With this information, I can give you the illusion that you are hearing a sound coming from any
azimuth z and elevation φ of my choice. All I have to do is select the spectrum corresponding to
the direction I want, and filter a recorded sound with that spectrum and play it for you over
headphones. The illusion even works with loudspeakers under favorable conditions.
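
To make the filtering step concrete, here is a minimal sketch. It assumes the measured ATF for each direction is stored as a pair of equal-length impulse responses, one per ear, in a hypothetical lookup table keyed by (azimuth, elevation) in degrees. Filtering is done by multiplying spectra, which is equivalent to convolving in time. The table contents below are toy values, not measurements.

```python
import numpy as np


def apply_atf(signal, impulse_response):
    """Filter a signal by multiplying its spectrum with the ATF spectrum."""
    n = len(signal) + len(impulse_response) - 1  # zero-pad to avoid wraparound
    spectrum = np.fft.rfft(signal, n) * np.fft.rfft(impulse_response, n)
    return np.fft.irfft(spectrum, n)


def render_direction(signal, atf_table, azimuth_deg, elevation_deg):
    """Return a 2-row array (left, right) for the requested direction."""
    left_ir, right_ir = atf_table[(azimuth_deg, elevation_deg)]
    left = apply_atf(signal, left_ir)
    right = apply_atf(signal, right_ir)
    return np.stack([left, right])


if __name__ == "__main__":
    # Toy table: the right ear hears a delayed, quieter copy of the left.
    toy_table = {(-90, 0): (np.array([1.0, 0.0, 0.0]),
                            np.array([0.0, 0.0, 0.5]))}
    mono = np.random.default_rng(0).standard_normal(1000)
    stereo = render_direction(mono, toy_table, -90, 0)
    print(stereo.shape)
```

A real system would interpolate between measured directions and use much longer responses; neither refinement is attempted here.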

The illusion even works pretty well if I substitute someone else's ATF for yours. These ideas
are the technological foundation for better-quality 3-D audio surround systems. They are able to
skip the tedious part of having to measure each individual's ATF by using the ATFs of people who
are good ATF "donors," that is, whose own response patterns are characteristic of those of many
other individuals.

6.13.2 Head Movement and Spectral Context 

There are still unresolved issues, even with ATF theory. I said that we use spectral modifications
caused by ATF to help locate sounds. But how can we know a priori whether the spectral features
of a signal we hear are due to ATF-induced filtering or are just built-in aspects of the source signal's
original spectrum? In fact, studies show we really can't tell the difference, that is, unless we can
turn our heads, which immediately clarifies whether a sound is really coming from that direction
or whether it just happens to match a positional spectral cue. If the source also moves, our
discrimination improves even more.

The effectiveness of 3-D audio surround systems is greatly improved if the system can
compensate for head movement. The listener wears a head-tracking system that allows the
experimenter to monitor all six degrees of freedom: pitch, roll, and yaw plus the standard {x, y, z}
Cartesian coordinates of position in space (figure 6.24). Additionally, if the sound spectrum is well
known to us, or we can observe the position of the sound source visually, we can discern whether
spectral features are due to the sound source or to our own ATF.

6.13.3 The Precedence Effect

When we hear a sound in a natural environment, we hear not only the signal that travels through
the air directly to our ears but also its many echoes reflected off nearby surfaces. These echoes
superimpose an incoherent jumble of delayed and scaled copies on top of the direct signal, which
is then delivered to our ears to sort out.

Performance of the ITD cue degrades in highly reverberant environments because it depends
upon receiving coherent phase information between the two ears. The ILD cue fares somewhat
better with reverberation because it only looks for level differences. But it is still subject to
confusion in rooms with strong standing wave patterns where intensities of sound are subject to local
minima and maxima determined by room geometry. For example, if I play a 500 Hz continuous
sinusoid over a loudspeaker in a room and have you walk around in it, you'll get different
impressions of the location of the speaker at different positions in the room because of the standing
waves.

The saving grace in all this is that, generally, the direct path to our ears from the source is
shorter than any of the reflected paths; thus we hear sounds traveling on the direct path first. Our
hearing is keenly aware of this fact and gleans as much directional information as possible from
the onset of the sound before the reflections begin to arrive. This is the precedence effect
(Wallach, Newman, and Rosenzweig 1949; Haas 1951). In general, sound location is perceived
to be in the direction from which the first signal arrives, so long as the strongest echoes arrive
within about 35 ms, they are spectrally and temporally similar, and they are not much louder than
the direct signal.

Under these circumstances, echoes are suppressed by the precedence effect. The precedence
effect can be demonstrated with a stereo audio system with loudspeakers separated by a few
meters. Set the controls to monophonic reproduction so that the same signal is sent to both
speakers. First, stand in front of the speakers exactly on the midline between them while playing some
music or speech. You will hear the signal in front of you, somewhere between the speakers. If you
then move toward one of the speakers by a meter or so, suddenly you will believe that only the
nearby speaker is making any sound; it's as though the distant one was switched off. But the other
speaker is clearly still contributing loudness and spaciousness to what you hear, which you can
demonstrate if you have an accomplice unplug the far loudspeaker. You will notice a reduction in
overall loudness and a reduction in spaciousness. But if it is plugged back in, you still can't hear
the sound as arriving from it, and your sense of the direction stays with the local speaker.

6.13.4 The Trade-off between Time and Intensity

ITD and ILD seem to be processed by the brain separately before being combined at a higher level
with other cues to model lateral position of sound. This can be exploited within a certain range to
play off ITD and ILD cues against each other.

If you stand in the median plane between two loudspeakers (called the sweet spot in the audio
literature) fed with a stereo sound signal, you will hear the stereophonic sound field in front of you.
If you move too far to one side, the sound field collapses and the precedence effect reinforces the
percept that the nearby speaker is the location of the sound source. However, to a certain extent,
this can be compensated for by boosting the intensity of the far speaker until it overcomes the ITD
cue, and you can restore again the sensation of being in the sweet spot. The ear apparently weighs
the ratio of ITD and ILD cues to determine lateralization.

6.13.5 Distance Cues

How do we know how far away a sound source is? Suppose I set up two loudspeakers in a room
behind an acoustically transparent but visually opaque screen. The first speaker is 3 meters in
front of you and I play a sound at intensity I. Suppose I then switch to a second speaker at twice
the distance and play the same sound with the same intensity I. You'd have no trouble telling
which was the closer sound source: because of the inverse square law, the intensity of the direct
signal arriving from the far speaker is I_d = √I; therefore you hear the second speaker as farther
away.

But suppose we do a second experiment where I secretly increase the intensity of the far speaker
to I², so that now I_d = √(I²), and repeat the procedure. Though the inverse square law cue is now
gone, you will still correctly tell me which speaker is the far one and will perhaps also mention that
I appear to have made the far one louder. How did you figure that out?

For every sound, your hearing judges not just the intensity of the direct signal I_d but also the ratio
of the direct signal intensity to the attendant reverberant signal intensity R as a cue for distance.
In the first experiment, we're pumping the same intensity I into the room from either speaker;
therefore the average reverberant intensity in the room is R no matter which speaker plays.
Reverberant energy is distributed uniformly throughout the room quickly after a sound starts. But
meanwhile the direct signal intensity went from I in the close speaker to √I in the far one. Thus, your
ear judged that

$$\frac{I}{R} > \frac{\sqrt{I}}{R}$$

and reasoned that if the reverberation intensity stayed the same but the direct signal intensity went
down, then the second speaker must be farther away.

However, in the second experiment, the intensity in the room goes from I to I². Therefore the
amount of reverberation in the room likewise goes from R to R². But meanwhile the intensity of
the direct signal that you heard remained the same. (Because I squared the intensity of the distant
speaker, the direct signal strength you experience from either speaker is identical.) Thus, your ear
judged that

$$\frac{I}{R} > \frac{I}{R^2}$$

and reasoned that if the direct signal intensity remains the same but the reverberant intensity
increases, the sound must be both farther away and louder.

We can confirm that your hearing is factoring reverberation into its cue for distance by
repeating this experiment in an anechoic chamber. As the name implies, it is a room that is so padded
that it produces no echoes, depriving you of the reverberation cue. This time you would
experience the second experiment as ambiguous and wouldn't be able to tell which speaker was
farther away.

Another distance cue is based on the fact that high frequencies are absorbed more quickly by
air than low frequencies. The greater the distance, the more the high frequencies in a signal are
attenuated. The effect is more exaggerated with greater humidity. So even in a large space without
echoes, like a flat desert, you can still tell relative distance because your hearing has a built-in
sense of how much air absorbs high frequencies.





6.14 Timbre

"Timbre," a word borrowed from French, is sometimes defined as "sound color." Here's the ANSI
(1999) definition: "Timbre is that attribute of auditory sensation in terms of which a listener
can judge that two sounds similarly presented and having the same loudness and pitch are
dissimilar.... Timbre depends primarily upon the spectrum of the stimulus, but it also depends
upon the waveform, the sound pressure, the frequency location of the spectrum, and the temporal
characteristics of the stimulus."

The first sentence of the ANSI definition is an example of the common but not very helpful
tendency to define timbre by what it is not. According to such negative definitions, timbre is what's
left over after pitch, loudness, and duration are accounted for. Let's call this approach the residue
theory of timbre. The second sentence is a bit more helpful.

Musicians have a highly developed, though informal, description of timbre. Musically, timbre
can refer to the features of a tone that serve to identify the instrumental source, such as oboe or
violin, or the instrumental family, such as woodwinds or strings. Alternatively, timbre can denote the
semantic quality of a musical tone, such as dark, dull, bright, or shrill. The most useful terms are
those that relate to measurable phenomena, such as sharpness, the ratio of high frequency energy
to total energy. Sharpness can be thought of as the "center of gravity" of the spectral envelope of
a sound (Bismarck 1974). Roughness characterizes tones or noises that contain frequency or
amplitude modulations between about 20 and 200 Hz (see section 6.7; and volume 2, chapter 9).
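
The "center of gravity" idea is easy to try out. The sketch below computes a plain magnitude-weighted spectral centroid, a common engineering stand-in that I am swapping in for illustration; it is not the psychoacoustic sharpness measure itself, and the function name is my own.

```python
import numpy as np


def spectral_centroid(signal, sample_rate):
    """Magnitude-weighted mean frequency (Hz) of a signal's spectrum."""
    magnitudes = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    total = magnitudes.sum()
    if total == 0:
        return 0.0  # silence has no meaningful centroid
    return float((freqs * magnitudes).sum() / total)


if __name__ == "__main__":
    sr = 8000
    t = np.arange(sr) / sr  # one second of samples
    dull = np.sin(2 * np.pi * 440 * t)
    bright = dull + np.sin(2 * np.pi * 3000 * t)
    print(spectral_centroid(dull, sr), spectral_centroid(bright, sr))
```

Adding high-frequency energy pulls the centroid upward, which matches the informal sense in which a tone becomes brighter or sharper.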

Positive theories of timbre have only recently begun to arise. The theoretical difficulties stem
partly from the multidimensional complexity of timbre (Licklider 1951; Plomp 1970) and partly
from the bias toward viewing timbre as the residue of pitch, loudness, and duration.

Modern psychoacoustical research into timbre has sought to understand

■ What are the principal perceptual structures the auditory system uses to determine timbre? In
other words, what timbral effects are we sensitive to, and in what order of precedence?

■ How does the auditory system categorize and order timbre? In other words, does the ear have a
natural taxonomy of sounds?

Research has shown that the two most significant perceptual structures of timbre are spectral 
energy distribution and evolution of spectral energy distribution over time. Thus, timbre consists 
primarily of the static and dynamic properties of a sound's spectrum, leaving aside pitch, loudness, 
and duration. Timbre identification has been shown to depend a great deal on spectral evolution. 

Although this definition is still basically a residual definition of timbre, at least it suggests a way 
to take a small step forward. Suppose we take a collection of instrument tones and normalize them 
so that they all have the same perceived pitch, loudness, and duration. Then any remaining differences 
between the tones would be, by definition, their timbre. We could then do experiments on 
the normalized collection to study how subjects experience the differences between the tones and 
try to understand from this how the auditory system organizes and categorizes timbre. 





Such research was carried out by John Grey (1975).¹³ Although his entire experiment goes 
beyond the scope of this book, in brief, he recorded a collection of standard orchestral instrument 
tones and performed a set of experiments to normalize them for pitch, loudness, and duration. 
Therefore, by definition, the normalized instrument tones differed only in timbre. Subjects were 
then played each possible pairing of these tones at random and asked after each pair how dissimilar 
they were on a scale of 1 to 10. The experiment generated thousands of perceptual dissimilarity 
judgments from a multitude of subjects. These judgments allowed Grey to construct a multidimensional 
constellation of the orchestral instruments where the distance between all the instruments 
is proportional to how dissimilar each tone is felt to be from all the others. It is important 
to note that the experiment included no hypothesis a priori; all Grey started with were the dissimilarity 
judgments of his subjects. 

The next step was to determine what features of the sound stimuli might best account for the differences 
his subjects heard. He used multidimensional scaling (MDS) techniques (Kruskal 1964) 
to reduce the number of dimensions of the dissimilarity judgments into a set of distances in 
three-dimensional space. Figure 6.25 shows a view of the dissimilarity judgments expressed as distances 
in three dimensions. Abbreviations for the instrument tones are O1, O2, oboes; C1, C2, clarinets; 
X1, X2, X3, saxophones; EH, English horn; FH, French horn; S1, S2, S3, strings; TP, 
trumpet; TM, trombone; FL, flute; BN, bassoon. 

It is important to note that the data specify only relative distances between data points. Grey 
examined the data in one, two, and three dimensions in all possible rotations and decided that the 
three-dimensional orientation shown in figure 6.25 offered the best possibilities for explanation of 
timbre differences. 

In this rotation, Grey noted the y-axis relates to the spectral energy distribution. On the one 
extreme, the French horn (FH) and strings (S3) have relatively narrow spectral bandwidth (fewer 
harmonics) with most energy concentrated in the lowest harmonics. At the other extreme, the 
trombone (TM) has a very wide spectral bandwidth (many harmonics) with energy more evenly 
distributed among them all. 

The x-axis relates to temporal energy distribution, specifically to how partials align during 
attack and decay. At one extreme, higher harmonics of the woodwinds enter and exit simultaneously 
with the low ones at onset and termination of a note. At the other extreme, the higher 
harmonics of strings, brass, flute, and bassoon tend to enter after the lower harmonics and exit more 
quickly than the lower ones. 

The x-axis also expresses musical instrument family partitioning. The woodwinds appear on the 
far left, the brass in the middle, and the strings on the far right. The exceptions to this pattern are 
the clustering of bassoon with the brass, and flute with the strings. 

Grey also interpreted the z-axis in terms of temporal patterns. At one extreme, the strings, flute, 
clarinets, saxophone (X1, X2), and oboe (O1) display initial high-frequency low-amplitude 
energy, most often inharmonic, during the attack segment. The tones at the other extreme, including 
brass, bassoon, and English horn, either have low-frequency inharmonicity or at least no 
high-frequency initial energy in the attack. 






Figure 6.25 

Three-dimensional hierarchical clustering analysis of timbre similarities. (Adapted from Grey 1975.) 


Grey also performed a clustering analysis of the data. The solid lines in figure 6.25 indicate the 
strongest clustering, followed by dashed and then dotted lines. For instance, string S1 is most like 
the flute, and string S2 is most like string S3. Also, the group {S1, FL} is more like the group {S2, S3} 
than the group {FH, BN, TP}. Last, the group {S1, FL} is more like group {FH, BN, TP} than anything 
except group {S2, S3}. 

The composer Henry Cowell (1930) wrote, 

If tone-qualities were arranged in order, and a notation found for them, it would be of assistance to composer 
and performer alike.... Tone-quality thus becomes one of the elements in the composition itself 









and ceases to be only a matter of performance.... Progress in the field of new or graduated tone-qualities 
in composition has been greatly hindered by lack of notation, as it has been justly felt that if music demanding 
new tonal values were set down in present notation, the desired effect would be likely to be entirely 
lost in the performance. (34) 

Figure 6.25 provides composers with interesting information on the use of timbre as an organizing 
principle in composition. For instance, to reinforce the independence of two melodic lines, 
composers could choose timbres that are far apart in the figure. One can entertain such ideas as 
transposable timbre by moving gradually from mellow to bright instruments. Timbres with more 
bite tend to stand out in an ensemble because their onset transients tend to contain a higher 
percentage of total energy. 

Grey's research is but one study with a very narrow focus: it covers only 16 specific sounds at 
one pitch, one duration, and one loudness level. We know little about the spaces between these timbres, 
let alone the possible maps of timbre space that would arise using other control parameters. 
The orchestral instruments themselves are not constant in timbre at different pitches, durations, and 
loudnesses. For example, the timbre of the clarinet demonstrates a wide range of effects across its 
pitch range. Nonetheless, Grey has provided a tempting glimpse. 

6.15 Summary 

The most salient aspects of musical sound are pitch, loudness, duration, sound location, and timbre. 
Psychometric scales such as decibels and phons were developed to give some objectivity to subjective 
judgments. While such units provide a common language for discussing the auditory abilities 
of a population of listeners, they should not be considered to be in the same league as physical 
measurements. Nonetheless, subjective judgments can help to expose a coherent subjective structure 
if collected over a sufficient number of data points, and this structure can in turn be related 
to various acoustical parameters such as amplitude and frequency. 

6.16 Suggested Reading 

Allen, J. B., and S. T. Neely. 1997. "Modeling the Relation Between the Intensity JND and Loudness for Pure Tones and 
Wide-Band Noise." Journal of the Acoustical Society of America 102 (December): 3628-3646. 

Bosi, Marina, and Richard E. Goldberg. 2003. Introduction to Digital Audio Coding Standards. Dordrecht, The 
Netherlands: Kluwer. 

Green, D. M. 1976. An Introduction to Hearing. Hillsdale, N.J.: Erlbaum. 

Moore, Brian. 1986. Frequency Selectivity in Hearing. San Diego: Academic Press. 

Moore, Brian. 1995. Hearing. San Diego: Academic Press. 

Moore, Brian. 1997. An Introduction to the Psychology of Hearing. 4th ed. San Diego: Academic Press. 

Pickles, J. O. 1988. An Introduction to the Physiology of Hearing. 2d ed. San Diego: Academic Press. 

Plomp, R. 1976. Aspects of Tone Sensation. San Diego: Academic Press. 

Tobias, J. V., ed. 1970/1972. Foundations of Modern Auditory Theory. 2 vols. San Diego: Academic Press. 

Yost, William A. 2000. Fundamentals of Hearing: An Introduction. 4th ed. San Diego: Academic Press. 



7 


Introduction to Acoustics 


A physicist who looks back over the history of his subject is struck by the prominent place that was originally 
occupied by musical acoustics. In fact it was one of the important sources of information about the nature of 
the physical world and a prime source of intellectual stimulation. 

— Arthur H. Benade, Trumpet Acoustics 

7.1 Sound and Signal 

Having focused on the listener in the previous chapter, I now focus on the medium and consider 
how sound travels. 

The sounds we hear correspond to pressure disturbances in the medium we are immersed in, whether air 
or water. In chapter 6 I mentioned that sound implies a source, a medium, and a receiver. This raises 
the age-old question: If a tree falls in the forest and there's no one to hear it, did it make a sound? 

One could argue that pressure disturbances in air are not sound until a subject experiences them, 
but this seems academic. A way out of the difficulty is to differentiate between a sound and a signal: 
A sound has meaning by the information it conveys from a source to the receiver; a signal is a sound 
that conveys such information. Therefore, an unheard sound is not a signal, but it is still a sound. 
Practically speaking, just as there is no harm in talking about a sunset (even though it's the Earth 
that turns), it's okay to discuss the propagation of sounds and signals without regard to a source 
or receiver so long as we're aware of the potential for contradiction. 

7.2 A Simple Transmission Model 

One simple model of the transmission of signals that takes into account the receiver, source, and 
medium is 

$$\overbrace{\text{Observed sound}}^{\text{Receiver}} = \overbrace{\text{Original sound}}^{\text{Source}} - \overbrace{\text{Transmission loss}}^{\text{Medium}},$$

where the transmission losses are spreading, absorption, and scattering of sound on the path from 
source to receiver. 





For sound transmission to carry information (for it to be a signal), it must be detectable at the 
receiver, whether the ear or a microphone. Ordinarily, this requires that 

■ The sound must be above the receiver's threshold of sensitivity and within its frequency range. 

■ The sound must be greater in strength than the ambient noise, that is, the signal to noise ratio 
must be greater than 1. 

Otherwise the signal is considered to be buried in the background noise. For signals meeting these 
detection criteria, we can get a rough prediction of whether a signal can be heard by relating the 
ambient noise level to the sound intensity level: 

$$\text{Signal to noise ratio} = \frac{\text{Observed intensity level}}{\text{Ambient noise intensity level}}.$$

If the result is greater than 1, it's a relatively safe assumption that it can be detected (although this 
must remain a rough estimate because for hearing, detectability is potentially affected by masking). 
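
The two detection criteria and the ratio test can be sketched in a few lines. This is only an illustration: the function names are made up, intensities are assumed to be linear quantities in W/m², the frequency-range condition is left out, and masking is ignored.

```python
def signal_to_noise_ratio(observed_intensity, ambient_intensity):
    """Ratio of observed intensity to ambient noise intensity (both in W/m^2)."""
    if ambient_intensity <= 0:
        raise ValueError("ambient intensity must be positive")
    return observed_intensity / ambient_intensity

def is_detectable(observed_intensity, ambient_intensity, receiver_threshold):
    """Rough test: above the receiver's threshold and stronger than the noise."""
    above_threshold = observed_intensity > receiver_threshold
    above_noise = signal_to_noise_ratio(observed_intensity, ambient_intensity) > 1
    return above_threshold and above_noise

# Hypothetical values: a signal 100 times stronger than the ambient noise.
print(is_detectable(1e-6, 1e-8, receiver_threshold=1e-12))
```

A signal weaker than the ambient noise fails the test even when it lies well above the receiver's own threshold, which is the situation the text describes as being buried in the background noise.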


7.3 How Vibrations Travel in Air 

The speed of sound is a function of how quickly a medium can transport energy by wave motion. 
This in turn depends upon the physical properties of the medium. 

When energy is injected into a medium, it seeks to return to its lowest energy level by radiating 
the energy away. In an elastic medium such as air, energy can be radiated either by heat convection 
(heat exchange between adjacent molecules) or by wave motion, or both. When a sound wave travels 
through a gas, the regions that are compressed become slightly warmer, and the regions that 
are expanded become slightly cooler. But the wavelength for most audible sound is relatively large 
in comparison to the rate that heat flows through air, so no appreciable heat flows from a condensation 
to an adjacent rarefaction. So most of the work in sound transmission happens because of 
pressure changes rather than thermal convection. 

A system that performs work without heat flowing into or out of it is adiabatic. Under normal 
atmospheric conditions, and for most audio frequencies of interest to humans, energy propagation 
by convection is much slower than energy propagation by sound wave, so air is considered an adiabatic 
medium. 

In the medium's undisturbed state, molecules of fluid media such as air collide randomly in three 
dimensions under the forces of thermally induced motion. Average particle speed is proportional 
to temperature. 

We can usefully think of air as an ideal gas, representing its molecules as a collection of perfectly 
hard spheres that collide but otherwise have no interaction with each other. An ideal gas 
stores all its energy in the translational velocity of the particles (that is, the particle speed). The 
random motion of the molecules is the air's internal energy or microscopic energy, as distinct from 
the air's macroscopic energy, which characterizes the large-scale motion of an air mass as a whole. 
Sound and wind are forms of macroscopic energy. 





[Figure: balls v, w, x, y, z in a row, with springs h, i, j, k between them.] 

Figure 7.1 

Idealized one-dimensional representation of air. 

Imagine a packet of air such as that enclosed by your lungs. 

■ If you move the air packet (by breathing in or out), energy is transferred from your lung muscles 
to the air and stored in the momentum of the air particles, an inertial property of air. 

■ If you compress the air packet (by holding your breath and squeezing your chest and diaphragm), 
energy is stored as heat, an elastic property of air. 

In either case the energy is stored in the air's momentum or compression because all the energy 
(except that lost to friction) will be released again when the air packet decelerates or the air packet 
decompresses. Understanding wave propagation depends on understanding how the medium's 
inertial and elastic properties interact. 

Figure 7.1 presents an idealized one-dimensional representation of air, in which small packets 
of air molecules are represented as balls {v, w, x, y, z} between springs {h, i, j, k}. In the beginning, 
the springs are all pressing with even force upon the balls, the forces balance, and there is no movement, 
corresponding to the ambient background air pressure.¹ 

If I give ball x an instantaneous shove right, it further compresses spring j and expands i. The 
force from x to y grows while the force from x to w shrinks. Therefore, first w and y are drawn to 
the right, then v and z, and so on. The movement of segment {x, y, z} is a compression wave, and 
the movement of {x, w, v} is an expansion wave. 

If stiffer springs are used, or if the springs are more compressed, the force displacing x would 
be conducted to y and w faster. Therefore, speed of wave propagation goes up with increasing stiffness. 
Just as the stiffness of a spring is raised as the pressure on it grows, so the stiffness of a gas 
is increased by raising the pressure $P$ it undergoes. 

Gases are compressible to the extent that they can easily convert pressure into internal energy, 
and different gases have different energy-storing capacities. A gas with a higher heat capacity ratio 
$\gamma$ is like a spring with greater inherent stiffness: it compresses less easily, that is, it requires more 
force to compress, and therefore it can store more energy per unit of volume than a more compressible 
gas with lower $\gamma$. So the elastic properties of a gas are its pressure $P$ and its ability to store 
heat $\gamma$. Increasing either $P$ or $\gamma$ (or both) increases wave propagation speed. 

Now for the inertial properties. If the mass of the balls in figure 7.1 were increased (and the 
springs were left unchanged), the balls would have more inertia, and the force displacing x 
would be conducted to y and w more slowly. The inertial property of a gas is its density $\rho$, 
defined as its mass $m$ per unit of volume $V$, or $\rho = m/V$. Increasing $\rho$ decreases wave propagation 
speed. 





The phase of matter has a huge impact on the inertial and elastic properties of different media. In 
general, solids have greater stiffness than liquids, which have greater stiffness than gases. For this reason, 
longitudinal sound waves travel faster in solids than in liquids or gases. One might think that 
because of gases' relatively small mass per unit volume, the speed of sound would be faster in gases. 
But the stiffness is so much greater in liquids and solids that, in general, for speed of sound $c$, 

$$c_{\text{solid}} > c_{\text{liquid}} > c_{\text{gas}}.$$


7.4 Speed of Sound 


To summarize the foregoing, 

■ Increasing elastic properties $P$ or $\gamma$ (or both) increases wave propagation speed. 

■ Increasing the inertial property $\rho$ decreases wave propagation speed. 

Combining these observations, we can say that the speed of sound $c_s$ in a gas is proportional to the 
ratio of its elastic and inertial properties: 

$$c_s \propto \frac{\text{elasticity}}{\text{density}} = \frac{\gamma P}{\rho}. \tag{7.1}$$

Since energy is proportional to the square of velocity, we can rewrite (7.1) as a proper equality: 

$$c_s^2 = \frac{\gamma P}{\rho},$$

and thus the speed of sound is 

$$c_s = \sqrt{\frac{\gamma P}{\rho}}. \qquad \text{Speed of Sound} \tag{7.2}$$


All we need now is to find appropriate values for $P$, $\gamma$, and $\rho$ for air in order to determine the speed 
of sound in air, but to find them requires a few additional discoveries. 
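
As a quick numerical preview of equation (7.2), the sketch below borrows three values that the chapter gives further on: a heat capacity ratio of 1.40 for air, standard atmospheric pressure of 101,325 Pa, and an air density of 1.29 kg/m³. The function name is illustrative only.

```python
import math

def speed_of_sound(gamma, pressure_pa, density_kg_m3):
    """Equation (7.2): c_s = sqrt(gamma * P / rho), in m/s."""
    return math.sqrt(gamma * pressure_pa / density_kg_m3)

# Values for air quoted later in the chapter.
c_s = speed_of_sound(gamma=1.40, pressure_pa=101_325, density_kg_m3=1.29)
print(round(c_s, 1))  # roughly 331.6 m/s
```

Raising the pressure or the heat capacity ratio raises the result, and raising the density lowers it, matching the two summary points above.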


7.4.1 Heat Capacity 

Heat is energy that flows from a higher-temperature object to a lower-temperature object. Because 
it is a kind of energy, its unit is the joule (J), the same unit used for work, kinetic energy, and potential 
energy. 

Heat that flows originates in the internal energy of the hotter substance. Internal energy is the 
sum of the molecular kinetic energy (the random kinetic motion of the molecules), molecular 
potential energy (forces acting within and between molecules), and other forms of energy. The 
internal energy of a substance is not called heat unless it is flowing. 

The amount of heat needed to raise the temperature of a substance by a certain amount, its heat 
capacity, depends upon the kind of substance and upon the mass of the substance. The heat 
capacity $Q$ of materials can be shown to be directly proportional to the change in temperature $\Delta T$ 
and the amount of mass $m$, so that $Q \propto m\,\Delta T$. Adding a constant of proportionality $c$, the specific 
heat capacity, allows us to determine the heat capacity of a specific material, $Q = cm\,\Delta T$. The value 
of $c$ must be determined experimentally for each specific material. Solving for $c$, we have a way 
to determine the heat capacity of specific materials: 

$$c = \frac{Q}{m\,\Delta T}. \qquad \text{Specific Heat Capacity} \tag{7.3}$$

From (7.3), the SI unit for specific heat capacity is J/(kg · C°). For example, the specific heat capacity 
of copper is 387 J/(kg · C°), and the specific heat capacity of water (at 15°C) is 4186 J/(kg · C°). 
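
A short worked example of $Q = cm\,\Delta T$ and equation (7.3), using the two specific heat capacities just quoted. The function names and the choice of 1 kg and a 10-degree rise are illustrative.

```python
def heat_required(c, mass_kg, delta_t):
    """Q = c * m * dT: heat in joules for a given temperature change."""
    return c * mass_kg * delta_t

def specific_heat_capacity(q_joules, mass_kg, delta_t):
    """Equation (7.3): c = Q / (m * dT), in J/(kg * C deg)."""
    return q_joules / (mass_kg * delta_t)

C_COPPER = 387.0   # J/(kg * C deg), from the text
C_WATER = 4186.0   # J/(kg * C deg) at 15 C, from the text

q_copper = heat_required(C_COPPER, mass_kg=1.0, delta_t=10.0)
q_water = heat_required(C_WATER, mass_kg=1.0, delta_t=10.0)
print(q_copper, q_water)  # 3870.0 41860.0
```

Warming a kilogram of water by ten degrees takes more than ten times the heat needed for the same mass of copper, and dividing that heat by the mass and temperature change recovers the specific heat capacity.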

The specific heat capacity of gases is different depending upon whether the gas is measured with 
constant pressure or constant volume. This distinction is usually not important for solids and liquids 
but can be significant for gases, such as air. The specific heat capacity measured holding pressure 
constant is called $c_p$, and the specific heat capacity measured holding volume constant is called 
$c_v$. For example, the constant pressure specific heat capacity $c_p$ of oxygen is 912 J/(kg · C°), and 
the constant volume specific heat capacity $c_v$ of oxygen is 651 J/(kg · C°). 

The importance of this distinction may not at first be obvious, but it turns out to be crucial for 
correctly calculating the speed of sound. Newton first analyzed the speed of sound in the Principia. 
His analysis was correct, but the predicted result was far smaller than measured values. This 
problem dogged theorists for the better part of a century and set back the progress of acoustics until 
the difference between $c_v$ and $c_p$ was discovered, and the mystery was solved. 

7.4.2 Heat Capacity Ratio 

The ratio $c_p/c_v$, the heat capacity ratio, characterizes the inherent molecular springiness of 
a gas: 

$$\gamma = \frac{c_p}{c_v}. \qquad \text{Heat Capacity Ratio} \tag{7.4}$$

It is the ratio of the specific heat capacity of a gas at constant pressure to the specific heat capacity 
at constant volume. 

If we compress a gas, we add to its internal energy, causing its temperature to rise. The compressibility 
of a gas depends on how its particles accommodate change of heat energy. This in turn 
determines the ratio of the change in heat energy to the change in temperature. 

For an ideal gas, $c_p$ exactly equals $c_v$ so that $\gamma = c_p/c_v = 1.0$. This means that the specific heat 
capacity is the same whether we hold pressure or volume constant. If we double the pressure on 
an ideal gas, the volume is halved. 

If $\gamma$ is greater than 1.0 (because $c_p > c_v$), the gas is not ideal. Nonideal gases store energy in the 
translational velocity as well as the rotational velocity and vibrational velocity of the particles. For 
nearly diatomic gases such as air, $\gamma$ is 7/5 = 1.40. 
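
Equation (7.4) can be checked against the oxygen figures quoted in section 7.4.1. The sketch below simply divides one by the other; the function name is illustrative.

```python
def heat_capacity_ratio(c_p, c_v):
    """Equation (7.4): gamma = c_p / c_v (dimensionless)."""
    return c_p / c_v

# Specific heat capacities of oxygen from the text, in J/(kg * C deg).
gamma_oxygen = heat_capacity_ratio(c_p=912.0, c_v=651.0)
print(round(gamma_oxygen, 2))  # 1.4
```

Oxygen, a diatomic gas, comes out very close to the 7/5 quoted for air.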





7.4.3 Mass Density 

Having considered the elastic properties of air, we must next look at its inertial properties before 
we can establish the physical basis of the speed of sound. Wave propagation is slower in more 
dense media because denser particles accelerate less quickly for the same applied force, therefore 
they communicate their force to their neighbors less quickly. 

The mass density $\rho$ of an undisturbed gas is its mass $m$ per unit of volume $V$: 

$$\rho = \frac{m}{V}, \qquad \text{Mass Density} \tag{7.5}$$

where mass is the quantity of matter contained in an object, and matter is anything that occupies space 
and exhibits inertia. The SI unit of mass density is kg/m³. For example, the mass density of helium 
is 0.179 kg/m³, and the mass density of air is 1.29 kg/m³ (Beranek 1986). To determine the speed of 
sound, we must determine the mass density of air, which is composed of numerous different gases. 

Air is composed of about 78 percent nitrogen (N₂), 21 percent oxygen (O₂), 0.9 percent argon 
(Ar), and 0.03 percent carbon dioxide (CO₂) by volume, and its average molecular mass is the sum 
of the products of the various atomic masses times their percentages. So to determine the average 
mass of air, we must first determine the atomic mass of the individual gases, which we do as follows. 

The mole is the SI base unit for expressing the amount of a substance measured in molecules. 
Amedeo Avogadro (1776-1856) discovered that 12 grams of carbon-12 contains $6.022 \times 10^{23}$ 
atoms. This number is known as Avogadro's number $N_A$. One mole (abbreviated mol) of a substance 
contains as many particles (atoms or molecules) as $N_A$. 

Since 1 mol has the same number of particles regardless of what the substance is, the difference 
in mass between substances is due to the difference in their molecular weights. For example, 
$6.022 \times 10^{23}$ atoms of carbon-12 weigh 12 g, and the same number of nitrogen molecules (N₂) weigh 
28.013 g. Table 7.1 shows calculations of average molecular mass for air. If $6.022 \times 10^{23}$ particles 
of air weigh 28.87 g, one particle weighs 

$$m = \frac{28.87\ \text{g}}{\text{mol}} \cdot \frac{1\ \text{mol}}{6.022 \times 10^{23}\ \text{particles}} = \frac{4.79 \times 10^{-23}\ \text{g}}{\text{particle}}, \qquad \text{Average Mass, Atom of Air} \tag{7.6}$$

or $4.79 \times 10^{-26}$ kg per air particle. 


Table 7.1 

Average Molecular Mass of Air 

Element    Percent     Atomic Mass     g/mol 
N₂         78.08       × 28.013        = 21.87 
O₂         20.95       × 31.998        =  6.70 
Ar          0.934      × 29.948        =  0.27 
CO₂         0.031      × 44.010        =  0.01 
Total                                    28.87 
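
The weighted sum behind table 7.1 and the per-particle mass of equation (7.6) can be reproduced as follows. The sketch copies the percentages and atomic masses exactly as the table prints them; note that standard references list argon at 39.948 g/mol, which would raise the total slightly, but the printed figure is kept here so that the result matches the 28.87 g/mol used in the text. The names are illustrative.

```python
AVOGADRO = 6.022e23  # particles per mole, as given in the text

# (percent of air, mass in g/mol), copied from table 7.1 as printed.
TABLE_7_1 = {
    "N2": (78.08, 28.013),
    "O2": (20.95, 31.998),
    "Ar": (0.934, 29.948),
    "CO2": (0.031, 44.010),
}

def average_molar_mass(table):
    """Sum of each component's mass weighted by its fraction of air (g/mol)."""
    return sum(percent / 100.0 * mass for percent, mass in table.values())

molar_mass_g = average_molar_mass(TABLE_7_1)
particle_mass_kg = molar_mass_g / AVOGADRO / 1000.0  # grams to kilograms

print(round(molar_mass_g, 2))  # 28.87
print(particle_mass_kg)        # close to 4.79e-26 kg
```

Nitrogen and oxygen account for nearly all of the total; the trace gases shift it only in the second decimal place.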





7.4.4 Pressure, Volume, and Temperature 

Pressure $P$ is force per unit area, measured in atmospheres. An atmosphere (atm) is defined as the 
average atmospheric pressure at sea level, with a standardized value of 101,325 Pa (pascal), a little 
over $10^5$ N/m², or about 14.7 pounds per square inch (see section 4.21). Measurements of air 
density are made by reference to standard temperature and pressure (STP), defined as 1 atm of 
pressure at 0°C, or 273.15 Kelvin. 

The pressure fluctuations of sound waves are very small in comparison to standard atmospheric 
pressure. Sound pressure level (SPL) ranges from about 0.1 Pa at the threshold of hearing up to 
about 1 Pa at the limit of hearing. This corresponds to a fluctuation of between $10^{-7}$ N/atm and 
$10^{-5}$ N/atm. 

If we have a volume of a fixed size, and we add more molecules of gas to it (for example, by 
pumping more air into a tire), the pressure increases. When the volume and temperature of an ideal 
gas are kept constant, doubling the number of molecules of air doubles the pressure. So pressure 
is proportional to the number of molecules, or equivalently, to the number of moles $n$ of the gas, 
so we can write $P \propto n$. 

If we have a volume of variable size, and we add more molecules of gas to it (for example, by 
pumping more air into a balloon), the volume increases. When the pressure and temperature of 
an ideal gas are kept constant, doubling the number of molecules of air doubles the volume. So 
volume is also proportional to the number of moles $n$, and we can write $V \propto n$. 

If we have a fixed number of molecules of air in a certain volume and we reduce the volume, 
the pressure increases, as happens when pressing down on the plunger of a bicycle pump with the 
outlet closed. When the number of molecules and the temperature of an ideal gas are kept constant, 
halving the volume doubles the pressure because the same number of molecules now occupy half 
the space. So for an ideal gas, pressure and volume are reciprocal: 

$$P \propto \frac{1}{V}, \quad \text{or} \quad PV \propto 1.$$

The final factor we must consider is temperature $T$. When volume is kept constant, raising the 
temperature of an ideal gas raises the pressure. And when the pressure is kept constant, raising the 
temperature of an ideal gas increases the volume. So temperature affects both pressure and volume 
directly, and we can write $PV \propto T$. 

Combining these four proportionalities, we can write $PV \propto nT$. We can rewrite this as an equation 
by inserting a proportionality constant $R$, called the universal gas constant. The result is the 
ideal gas law: 

$$PV = nRT, \qquad \text{Ideal Gas Law} \tag{7.7}$$

where $n$ is the number of moles of gas present, $T$ is absolute temperature in Kelvin, and $R$ is a magic 
number worked out experimentally in the late 1700s and codified in Boyle's law, Charles's law, 
and Avogadro's hypothesis. A good value for it is 8.314351 J/(mol · K). 
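
As a plausibility check on the ideal gas law, the sketch below solves $PV = nRT$ for the number of moles in one cubic meter of air at STP and converts that to a mass using the 28.87 g/mol average from table 7.1. The function name and the one-cubic-meter volume are illustrative choices.

```python
R = 8.314351          # universal gas constant in J/(mol * K), from the text
P_STP = 101_325.0     # standard pressure in Pa
T_STP = 273.15        # standard temperature in K
MOLAR_MASS_AIR = 28.87e-3  # kg/mol, average from table 7.1

def moles_of_gas(pressure_pa, volume_m3, temperature_k):
    """Ideal gas law solved for n: n = P * V / (R * T)."""
    return pressure_pa * volume_m3 / (R * temperature_k)

n = moles_of_gas(P_STP, volume_m3=1.0, temperature_k=T_STP)
density = n * MOLAR_MASS_AIR  # kg in one cubic meter, i.e. kg/m^3

print(round(n, 1))        # about 44.6 mol
print(round(density, 2))  # about 1.29 kg/m^3
```

The resulting density agrees with the 1.29 kg/m³ figure for air quoted in section 7.4.3.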





Equation (7.7) expresses the ideal gas law in terms of moles $n$, but using Avogadro's number, 
it is easy to express this in terms of the total number of particles $N$. The total number of particles 
$N$ is just the number of moles $n$ times the number of particles per mole, $N_A$, in other words, 
$nN_A = N$. Substituting into (7.7), we have 

$$PV = nRT = \frac{N}{N_A}RT = N\left(\frac{R}{N_A}\right)T.$$

The constant term $R/N_A$ is Boltzmann's constant,² usually represented by the symbol $k$: 

$$k = \frac{R}{N_A} = \frac{8.31\ \text{J/(mol} \cdot \text{K)}}{6.022 \times 10^{23}\ \text{atoms/mol}} = 1.38 \times 10^{-23}\ \text{J/K}. \qquad \text{Boltzmann's Constant} \tag{7.8}$$

Substituting $k$ into the ideal gas law, we have 

$$PV = NkT. \qquad \text{Ideal Gas Law using Boltzmann's Constant} \tag{7.9}$$

Notice that the values of (7.7) and (7.9) are expressed only in units of pressure, volume, quantity 
of gas, and temperature. From these we can calculate the speed of sound in air. 
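
The two forms of the ideal gas law, (7.7) and (7.9), should give the same pressure for the same sample of gas. The sketch below computes Boltzmann's constant as $R/N_A$ and checks that agreement; the sample of 2 mol in 0.05 m³ at 300 K is a made-up illustration.

```python
import math

R = 8.314351      # universal gas constant in J/(mol * K), from the text
N_A = 6.022e23    # Avogadro's number, particles per mole

k = R / N_A       # Boltzmann's constant in J/K, equation (7.8)

def pressure_from_moles(n, temperature_k, volume_m3):
    """Equation (7.7) solved for P: P = n * R * T / V."""
    return n * R * temperature_k / volume_m3

def pressure_from_particles(count, temperature_k, volume_m3):
    """Equation (7.9) solved for P: P = N * k * T / V."""
    return count * k * temperature_k / volume_m3

p_moles = pressure_from_moles(2.0, 300.0, 0.05)
p_particles = pressure_from_particles(2.0 * N_A, 300.0, 0.05)

print(k)                                   # about 1.38e-23 J/K
print(math.isclose(p_moles, p_particles))  # True
```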

7.4.5 Calculating the Speed of Sound 

Solving (7.9) for pressure, we get 

$$P = \frac{N}{V}kT. \tag{7.10}$$

This can be rewritten in terms of mass density $\rho$ and the average mass per particle $m$ because 
$N/V = \rho/m$ (a consequence of equation (7.5)). Substituting this equality into (7.10), we get 

$$P = \frac{\rho}{m}kT. \tag{7.11}$$

A rearrangement of (7.11) yields 

$$\frac{P}{\rho} = \frac{kT}{m}. \tag{7.12}$$

Now, recall equation (7.2) for the speed of sound: 

$$c_s = \sqrt{\frac{\gamma P}{\rho}}.$$

Substituting (7.12) into (7.2) yields 

$$c_s = \sqrt{\frac{\gamma kT}{m}}. \tag{7.13}$$





We have an equation for the speed of sound based only on easily measurable quantities: temperature, 
compressibility, and mass. Plugging in the values $\gamma = 7/5$, $m = 4.79 \times 10^{-26}$ kg, and 
$k = 1.38 \times 10^{-23}$ J/K yields 

$$c_s = \sqrt{\frac{\gamma kT}{m}} = \sqrt{\frac{(7/5)(1.38 \times 10^{-23})}{4.79 \times 10^{-26}}}\,\sqrt{T} = 20.0833\sqrt{T}. \qquad \text{Speed of Sound} \tag{7.14}$$

Temperature (in Kelvin) is the only nonconstant term for the speed of sound in air at standard 
atmospheric pressure. To use the more familiar Celsius scale, add 273.15 to the Celsius temperature, 
so that 0°C corresponds to $T = 273.15$. For example, the speed of sound at standard temperature and pressure is 

$$20.0833\sqrt{T} = 20.0833 \cdot 16.52725 = 331.9\ \text{m/s}, \qquad \text{Speed of Sound at STP} \tag{7.15}$$

or a little over 1000 ft/s or about 1 ft/ms. 

Air is nondispersive, meaning that $c_s$ does not change with frequency, as the speed of light does in glass, for 
example. 
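
The temperature dependence is easy to explore numerically. The sketch below evaluates $c_s = \sqrt{\gamma kT/m}$ with the constants quoted in the text and accepts a Celsius temperature for convenience; the function name and the 20°C example are illustrative.

```python
import math

GAMMA = 7 / 5     # heat capacity ratio of air, from the text
K = 1.38e-23      # Boltzmann's constant in J/K, from (7.8)
M = 4.79e-26      # average mass of an air particle in kg, from (7.6)

def speed_of_sound(celsius):
    """Speed of sound in air in m/s at the given Celsius temperature."""
    kelvin = celsius + 273.15
    return math.sqrt(GAMMA * K * kelvin / M)

print(round(speed_of_sound(0.0), 1))   # 331.9, matching (7.15)
print(round(speed_of_sound(20.0), 1))  # roughly 344 at room temperature
```

Because the speed grows with the square root of absolute temperature, a 20-degree rise changes it by only a few percent.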

7.4.6 Universal Wave Equation 

Now that we have an analytic means to determine the speed of sound, we can use it to relate the 
wavelength of a wave directly to its frequency. The speed of sound $c$ is 

$$c = f\lambda\ \text{m/s}, \qquad \text{Universal Wave Equation} \tag{7.16}$$

where $f$ is frequency in cycles per second, and $\lambda$ is wavelength in meters. Knowing any two of these 
properties allows us to find the third: 

$$f = \frac{c}{\lambda}, \qquad \lambda = \frac{c}{f}.$$

For example by (7.15), the frequency of a wavelength of 10 m is $f = 33.19$ Hz. The wavelength of 
1000 Hz in air at STP is $\lambda = 0.3319$ m, or about 1 ft. 
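
The two examples above follow directly from rearranging $c = f\lambda$. A minimal sketch, assuming the STP speed of 331.9 m/s from (7.15) and illustrative function names:

```python
C_STP = 331.9  # speed of sound in air at STP in m/s, from (7.15)

def frequency_hz(wavelength_m, c=C_STP):
    """f = c / wavelength."""
    return c / wavelength_m

def wavelength_m(freq_hz, c=C_STP):
    """wavelength = c / f."""
    return c / freq_hz

print(frequency_hz(10.0))    # about 33.19 Hz
print(wavelength_m(1000.0))  # about 0.3319 m
```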

7.5 Pressure Waves 

Sound is propagated through a medium such as air by longitudinal waves, where particle motion, 
wave motion, and energy flow are all in the same direction. Longitudinal waves are also called 
pressure waves (or P-waves) because wave propagation is carried by pressure differences in the 
medium. A two-dimensional representation of a longitudinal pressure wave is shown in figure 7.2. 
The figure can be thought of as a density contour map of the medium where the darker areas, or 
compressions $C$, have greater density than the lighter areas, or rarefactions $R$. The ordering of densities 
is always $R < \rho_0 < C$, where $\rho_0$ is density of air at STP. 

As previously discussed, when a packet of air is compressed, the molecules store the energy as 
heat. When a packet of air is accelerated, the molecules store the energy as inertia. If we think about 
figure 7.2 as a snapshot of a sound waveform in time, we see that the wave travels by alternating 
inertia storage and heat storage through time. At any moment, the most compressed/rarefied air 
packets (hence the ones with the most/least heat) have no momentum, whereas the air packets with 
the most velocity (hence the ones with the most momentum) have no compression. 

Figure 7.2 

Longitudinal pressure wave. 

7.6 Sound Radiation Models 

Sound propagates with a spherical radiation pattern in free space of uniform density,³ and the intensity 
of a signal at some distance will generally be inversely proportional to the square of the distance. 
But the actual distribution of transmitted sound energy depends upon other factors, 
including the radiation pattern of its source, which is a function of the efficiency of sound propagation 
in three dimensions. For instance, we experience the loudest signal from a violin if we 
directly face its top; it sounds quieter if we face its side. 

Additionally, sound can be reflected, refracted, or absorbed on its way from a source to a receiver 
by objects encountered along the way. For instance, we experience an even quieter signal from a 
violin by standing directly behind the violinist, in the player's acoustical shadow. What arrives at 
our ears is determined by these factors (among others) in combination. 
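
The inverse-square relationship for an idealized source can be illustrated numerically. The sketch assumes a point source radiating equally in all directions into free space, so that its power spreads over a sphere of area $4\pi r^2$; the 1-watt source and the function name are hypothetical.

```python
import math

def intensity(power_w, distance_m):
    """Intensity in W/m^2 of an ideal point source at a given distance."""
    return power_w / (4 * math.pi * distance_m**2)

near = intensity(1.0, 1.0)
far = intensity(1.0, 2.0)

print(far / near)                    # 0.25: double the distance, a quarter the intensity
print(10 * math.log10(far / near))   # about -6 dB
```

A real instrument such as the violin departs from this ideal because its radiation pattern favors some directions over others.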






Figure 7.3 

Graphical models of the transmission of sound. 

The transmission of sound can be modeled as either waves or rays. Figures 7.3a and 7.3b show how
these models represent two-dimensional waves in close proximity to a point source radiating
in all directions. Near the source, we see great curvature of the wave fronts. In three dimensions,
wave fronts emerge in layers of pressure areas from the source in a spherical pattern. In figures 7.3c
and 7.3d, we are viewing sound at intermediate proximity from its source, in the near field or the
Fresnel zone. Equivalently, we are viewing a magnified portion of the sound through a small aperture
nearby its source. Though the total 2-D signal path is still circular, our view of it is so limited that
what we see begins to look more like parallel rays or parallel wave fronts. Figures 7.3e and 7.3f show
the sound at great distance from its source, in the far field, or Fraunhofer region. Equivalently, we
are viewing at extreme magnification through a very small aperture nearby the source. We see so little
curvature that for all intents and purposes the rays and wave fronts are parallel. In this region, the
radiation pattern is independent of distance. These are called plane waves, although strictly speaking
they are only geometrically planar in the limit at an infinite distance from the source. In three
dimensions, plane waves pass by like sheets of pressure areas from the direction of the source.

Portraying transmission of sound as rays helps us visualize the direction and strength of sound
propagation, and portraying it as waves helps us visualize wavelength. Both approaches are
just models of the underlying physical phenomena, which we use to help make sense of what we
experience. We can adopt whichever perspective helps us understand, and switch back and forth
between models at will.

At what point do we go from near field to far field, from effectively spherical to effectively
planar transmission? Three factors are significant: the distance from the source $s_0$, the area of
the aperture $a$ through which we observe the waveform pass, and the wavelength $\lambda$ of the
waveform (see section 4.24.4). An observation is termed far field if the distance from the source
is much greater than double the area $a$ of the aperture divided by the wavelength $\lambda$, called the
Rayleigh distance:

$s_0 \gg \frac{2a}{\lambda}$.    Rayleigh Distance (7.17)
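The far-field criterion can be sketched in a few lines of Python. This is a minimal illustration that takes the Rayleigh distance to be twice the aperture area divided by the wavelength, as the paragraph above words it. The function names are hypothetical, and the factor used to interpret "much greater than" (`margin`, defaulting to 10) is an arbitrary assumption rather than a value from the text.

```python
def rayleigh_distance(aperture_area, wavelength):
    """Twice the aperture area divided by the wavelength, following the
    verbal statement of equation (7.17)."""
    if aperture_area <= 0 or wavelength <= 0:
        raise ValueError("aperture area and wavelength must be positive")
    return 2.0 * aperture_area / wavelength

def is_far_field(source_distance, aperture_area, wavelength, margin=10.0):
    """Treat 'much greater than' as 'at least `margin` times greater'.
    The value of `margin` is an assumption made for this sketch."""
    return source_distance >= margin * rayleigh_distance(aperture_area, wavelength)

if __name__ == "__main__":
    area, lam = 0.5, 0.25  # hypothetical aperture area and wavelength
    print("Rayleigh distance:", rayleigh_distance(area, lam))
    for s0 in (1.0, 10.0, 100.0):
        print(f"distance {s0:6.1f}: far field? {is_far_field(s0, area, lam)}")
```

Raising `margin` makes the test stricter; the text itself gives no numerical threshold.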






7.7 Superposition and Interference 

Wave interference occurs when two or more waves act simultaneously on a medium. When such
waves pass through each other in an ideal medium, the resulting disturbance at any point in the
medium can be found simply by adding the individual displacements that each wave would have
caused by itself. This is the principle of superposition.

Where the interfering waves have the same sign, the sum of their displacements will be larger
than either wave by itself, resulting in constructive interference. Thus, for example, when the crest
of one wave is superposed upon the crest of another, they interfere constructively. The same goes
for a trough superposed upon a trough.

Where the interfering waves have opposite sign, the sum of their displacements will be smaller
than either wave by itself, resulting in destructive interference. This occurs, for example, when a
crest is superposed upon a trough, or vice versa. For examples of constructive and destructive
interference, see figure 6.13.

Where two waves of opposite sign and equal magnitude coincide, they cancel, resulting in no
displacement of the medium. A listener at that position would hear silence. This is essentially the
principle behind noise-canceling headphones. A microphone mounted near the ear detects an
incident sound wave, and an electronic circuit creates an inverse waveform matching the incident
sound wave, and plays it through the loudspeaker in the headphones so that the incident sound is
canceled by the inverse waveform when it reaches the ear.
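The principle of superposition can be illustrated with sampled waveforms. The following Python sketch adds displacements sample by sample, showing constructive interference (a wave added to itself) and the idealized noise-canceling case (a wave added to its exact inverse). The helper names, the 440 Hz tone, and the 8000 Hz sample rate are assumptions chosen only for illustration; real headphones must also cope with delay and imperfect matching.

```python
import math

def sine_wave(frequency, sample_rate, n_samples, amplitude=1.0):
    """Sampled sine wave: displacement of the medium at one point over time."""
    return [amplitude * math.sin(2.0 * math.pi * frequency * n / sample_rate)
            for n in range(n_samples)]

def superpose(*waves):
    """Principle of superposition: add the individual displacements sample by sample."""
    if len({len(w) for w in waves}) != 1:
        raise ValueError("all waves must have the same number of samples")
    return [sum(samples) for samples in zip(*waves)]

incident = sine_wave(440.0, 8000, 64)        # sound arriving at the ear
reinforced = superpose(incident, incident)   # same sign: constructive interference
anti_noise = [-x for x in incident]          # idealized inverse waveform
residual = superpose(incident, anti_noise)   # opposite sign, equal magnitude

if __name__ == "__main__":
    print("peak of incident wave  :", max(abs(x) for x in incident))
    print("peak after reinforcing :", max(abs(x) for x in reinforced))
    print("peak after cancellation:", max(abs(x) for x in residual))
```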

7.8 Reflection 

Interpreted as rays, sound obeys Newton's laws of motion because sound rays continue in a state
of motion "at a constant speed along a straight line, unless compelled to change that state by a net
force." Reflection acts as a net force upon a wave to deflect its direction.

Reflection of sound waves occurs only where the speed of sound changes, which, according to
(7.2), happens only where the density or elasticity of the medium changes. Reflection occurs at the
boundaries between media with different densities and elasticities, for instance, where sound
strikes a wall. But reflections can also occur within the same medium where its density or elasticity
changes. For example, the enormous impulsive sound wave created by a bolt of lightning in the
clouds would be heard as a single clap, like a sonic boom, except for the numerous reflections
caused by the turbulent differences in pressure and temperature within the storm. We hear the effect
of these reflections as rolling thunder.

Sound reflection is just like light reflection: the incident ray and the reflected ray
lie on the same plane, and the angles of incidence $\theta_i$ and reflection $\theta_r$ are equal (figure 7.4a),

$\theta_r = \theta_i$.    Law of Reflection (7.18)

Specular reflection, which is reflection from smooth, relatively plane surfaces, creates a phantom
source equidistant to the perpendicular of the reflecting surface (figure 7.4b). Nonspecular






Figure 7.4 

Phantom reflection source. 


reflection (figure 7.4c) creates dispersion, scattering, or diffuse reflection. If the reflecting
surface is sufficiently diffuse, no phantom source is created.

Looking at reflection from the wave perspective, we'd say that each local point on the reflecting
surface emits a new spherically spreading wave front in response to the incident wave. The
direction in which this new reflected wave front is propagated is constrained by

■ The local geometry of the surface it strikes 

■ The pressure it experiences from other local wave fronts

This characterizes any reflection, specular or not. For a plane wave striking a plane surface, the
hemispheres radiated by each point form a coherent wave front that resembles the impinging wave
front but traveling in a new direction. For nonspecular reflection, we must examine the way each
hemisphere is constrained by its local surface and the influences of nearby hemispheres in response
to their own local conditions.

A wave front will experience scattering if the dimensions of the object it encounters are
small in comparison to its wavelength. A wave front will experience reflection if the dimensions
of the object it encounters are large in comparison to its wavelength. For example,
a table top 1 m in diameter will tend to scatter wavelengths larger than 1 m and reflect smaller
wavelengths.
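The table-top example can be turned into a crude rule of thumb in Python. The sketch below converts a frequency to a wavelength and compares it to the object's size. The hard threshold (wavelength greater than object size means "scatter") is a simplifying assumption, since the text says only "small" or "large" in comparison; the function names are hypothetical, and the default speed of sound of 331 m/s is borrowed from the value quoted in section 7.9.2.

```python
def wavelength(frequency_hz, speed_of_sound=331.0):
    """Wavelength in meters for a given frequency (speed = frequency * wavelength)."""
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    return speed_of_sound / frequency_hz

def dominant_behavior(object_size_m, frequency_hz, speed_of_sound=331.0):
    """Crude classification: wavelengths longer than the object tend to scatter,
    shorter ones tend to reflect. A real object shows a gradual transition."""
    if wavelength(frequency_hz, speed_of_sound) > object_size_m:
        return "scatter"
    return "reflect"

if __name__ == "__main__":
    table_top = 1.0  # diameter in meters, as in the example above
    for f in (100.0, 331.0, 1000.0, 4000.0):
        print(f"{f:7.1f} Hz (wavelength {wavelength(f):.3f} m): "
              f"{dominant_behavior(table_top, f)}")
```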

Assuming that the direct path between source and receiver is not blocked, the first sound we
hear from a source is always the direct signal because it travels the shortest distance between
source and receiver. Since there is strong evolutionary survival value in knowing the direction
from which a sound originates, our hearing suppresses reflected copies arriving after the direct
signal from other directions (see section 6.13.3). But the precedence effect only works for about
35 ms after the arrival of the direct signal. Reflections arriving after that are experienced as
distinct echoes. Reflections within the precedence interval are experienced as lending spaciousness
to the sound. If there are so many reflections that we cannot distinguish them, we hear them as
reverberation.
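To make the 35 ms figure concrete, the following Python sketch estimates how late a single reflection arrives relative to the direct signal and labels it accordingly. The function names, the path lengths, and the default speed of sound (331 m/s, the value quoted in section 7.9.2) are assumptions for illustration; the sharp cutoff is a simplification of what the text describes as "about 35 ms".

```python
PRECEDENCE_WINDOW_S = 0.035  # "about 35 ms", per the text

def reflection_delay(direct_path_m, reflected_path_m, speed_of_sound=331.0):
    """Arrival delay (seconds) of a reflection relative to the direct signal."""
    if reflected_path_m < direct_path_m:
        raise ValueError("a reflected path cannot be shorter than the direct path")
    return (reflected_path_m - direct_path_m) / speed_of_sound

def perceived_as(direct_path_m, reflected_path_m, speed_of_sound=331.0):
    """Label a single reflection using a hard 35 ms threshold (a simplification)."""
    delay = reflection_delay(direct_path_m, reflected_path_m, speed_of_sound)
    return "distinct echo" if delay > PRECEDENCE_WINDOW_S else "fused with direct sound"

if __name__ == "__main__":
    for reflected in (12.0, 15.0, 30.0):
        delay_ms = 1000.0 * reflection_delay(10.0, reflected)
        print(f"reflected path {reflected:5.1f} m: delay {delay_ms:5.1f} ms, "
              f"{perceived_as(10.0, reflected)}")
```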





7.8.1 Determining Distance from Reflections

Reflected energy can be used to measure distance to a reflective object because some energy
usually returns to the source. This is the way radar works. The same technique also works for sound:
for example, some cameras are equipped with a distance-finding device consisting of an element
that makes a highly directional clicking sound, and a microphone. The click emitted by the device
reflects off the object the camera is aimed at, and some of the energy returns to the device's
microphone. The device measures the elapsed time and uses equation (7.16), the universal wave
equation, to calculate the distance to the object.
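The arithmetic behind such a range finder can be sketched as follows. This Python example does not reproduce equation (7.16); it relies only on the assumption that distance equals speed multiplied by time, and that the click travels to the object and back. The function name and the default speed of sound (331 m/s, the value quoted in section 7.9.2) are illustrative assumptions.

```python
def distance_from_echo(round_trip_time_s, speed_of_sound=331.0):
    """One-way distance to a reflector, given the time between emitting a click
    and receiving its echo. The sound covers the distance twice, so halve it."""
    if round_trip_time_s < 0:
        raise ValueError("elapsed time cannot be negative")
    return speed_of_sound * round_trip_time_s / 2.0

if __name__ == "__main__":
    for t_ms in (5.0, 20.0, 60.0):
        d = distance_from_echo(t_ms / 1000.0)
        print(f"echo after {t_ms:5.1f} ms: object is about {d:.2f} m away")
```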

Since sound can penetrate opaque objects, sound reflection can be used to map underground
rock strata. Using acoustic pulse reflectometry, geologists track the reflections of a shock wave
transmitted through the earth by setting off a small explosion and recording the echoes (figure 7.5).
By examining the delay and amplitude of the most prominent reflections at various microphones,
and knowing the average speed of sound in the earth, geologists can infer the depth and location
of strata of different densities.

The same principle has been applied to deriving the bore profile of tubular musical instruments
such as winds and brass, and the tracheal tubes of humans and animals, using a noninvasive
technique (Ware and Aki 1969). Though these instruments can in principle be measured with calipers,
in practice complications such as side holes and the inaccessibility of interior tubes limit the
accuracy of this approach. For the voice, a noninvasive approach to measuring the tracheal tubes of
live subjects is invaluable. With acoustic pulse reflectometry, an impulse of sound is injected into the
instrument. Reflections arise in response to the impulse where the bore diameter changes. The
more rapid the change in bore diameter, the larger the reflection it causes. The resulting impulse
response function is measured by a microphone. The impulse response is converted into a
cross-sectional area as a function of axial distance from the impulse source, providing the desired
bore profile.⁴



Figure 7.5 

Acoustic pulse reflectometry. 




7.8.2 The Old Rope Trick

We can model sound reflection in transverse waves by attaching a rope to a wall. Holding the free
end, if you snap the rope, an impulsive wave travels down its length. When the wave reaches the
wall, some of the wave energy reflects back towards your hand while some is transmitted into the
wall. The returning wave experiences phase reversal, meaning that the returning pulse travels on
the opposite side of the rope: if the outgoing pulse travels above the rope, the reflection returns
below the rope (figure 7.6a). This is how sound travels in strings with rigid terminations.

Attaching the end of the rope to a lightweight thread and the thread to a wall makes the rope act
as though it were free at the wall end (figure 7.6b). The reflected wave remains on the same side
of the rope both coming and going, so there is no phase reversal.

When the rope's end is fixed, as in figure 7.6a, we can think of the rope's reflected wave as
coming from an imaginary inverted source on the other side of the wall (figure 7.7). If the rope is fixed
to the wall, the wave's displacement clearly must be zero when it reaches the wall (assuming the
wall is inflexible). We would achieve the same effect if we had an identical rope on the other side
of the wall that was shaken in the opposite (inverted) direction. The imaginary inverted wave arriving
from the imaginary source would exactly cancel the real source when the two motions meet at the
wall, also providing zero displacement.


Figure 7.6

Reflection of an impulsive wave on a rope. (Labels recovered from the figure: b) free end; wall;
outgoing wave; returning wave; no phase reversal.)

Figure 7.7

Rope with fixed end. (Labels recovered from the figure: real source; wall; waves cancel.)





Figure 7.8

Rope with free end. (Label recovered from the figure: waves combine.)



Figure 7.9 

Shive wave machine. 


However, when the rope's end is free, as in figure 7.6b, the end snaps like a whip, momentarily
doubling its displacement. We can model this as an uninverted imaginary wave arriving from
the imaginary source, so the waves add when they meet, providing twice the displacement
(figure 7.8).

7.8.3 Shive Wave Machine 

Most natural wave motion occurs too fast or is otherwise too subtle to follow easily with the eye. In
the 1950s, John N. Shive developed a wave machine at Bell Telephone Laboratories that clearly
reveals transverse wave motion. An array of stiff steel rods is attached crosswise at regular intervals
to a wire (figure 7.9). Because of the relatively large inertia of the rods compared to the elasticity of
the wire, a wave takes several seconds to travel from one end of the array to the other. If the tips
are painted with phosphorescent paint and the apparatus is viewed under black light, only the tips are
visible, and one sees an array of dots moving up and down in transverse wave motion. This can
be used to replicate the experiments with the ropes. With the far end free, as in figure 7.9, a positive
impulse sent down the wave machine results in a positive reflected wave, as in figure 7.6b. With the
rod at the far end clamped so it can't move up or down, a negative wave is reflected, as in figure 7.6a.

7.8.4 Reflection and Transmission at Media Boundaries

The Shive wave machine can also be used to examine what happens when waves cross the
boundary between two media with different speeds of sound. The crossbars at the right end of the wave
machine shown in figure 7.10 have been shortened so that the speed of wave propagation along
this section of the machine is doubled.⁵ Call the slower speed of wave propagation in the long bars






Figure 7.10 

Reflection and transmission at a boundary.


$c_s$ and the faster speed in the short bars $c_f$. The figure shows a wave crossing over the boundary,
one half still in the slow section, one half in the fast section.

We can compare what happens when a wave crosses from a slower to a faster medium (send
an impulse from (a) in figure 7.10) and from a faster to a slower medium (send an impulse from
(b) in the figure).

■ If, as in figure 7.9, the bars of the Shive machine do not change length, then $c_s/c_f = 1$ and the
speed of propagation remains unchanged. All energy is transmitted along its entire length; none
is reflected until it reaches the end.

■ If, as in figure 7.10, the bars change length, then $c_s/c_f \neq 1$. Because there is a difference in the
speed of propagation, some energy is transmitted and some is reflected.

■ The initial disturbance that moves toward the barrier is the incident wave, the reflected wave is
returned from the boundary, and the transmitted wave passes through the boundary.

■ The sum of the reflected and transmitted energy always equals the total original energy (apart
from that lost to friction or sound).

■ Where wave motion goes from the slower medium into the faster medium, as in figure 7.10a, the
returning wave does not experience a phase reversal, like the rope attached to a string.

■ Where wave motion goes from the faster medium into the slower medium, as in figure 7.10b, the
returning wave experiences a phase reversal, like the rope attached to a wall.

■ In no case is the phase of the transmitted wave reversed.

■ The frequency of the wave is preserved across the boundary.

We can account for these phenomena as follows. Suppose there is a boundary where medium 1
has speed of propagation $c_1$ on one side and medium 2 has speed of propagation $c_2$ on the other.
Assume that the incident wave always starts in medium 1. Then the amplitude coefficient of the
reflected wave $A_r$ will be

$A_r = \frac{c_2 - c_1}{c_1 + c_2}$.    Reflection (7.19)






The amplitude coefficient of the transmitted wave $A_t$ will be

$A_t = \frac{2c_2}{c_1 + c_2}$.    Transmission (7.20)


For example, if the speed in medium 1 is twice that of medium 2, so that $c_1/c_2 = 2/1$, and the
incident amplitude is equal to 1, then by (7.19) the amplitude of the reflected wave will be $-1/3$ and
by (7.20) the amplitude of the transmitted wave will be 2/3. This is equivalent to starting the
incident wave from (b) in figure 7.10.

If we reverse this so that $c_1/c_2 = 1/2$, then reflected amplitude is 1/3 and transmitted amplitude
is 4/3. This is equivalent to starting the incident wave from (a) in figure 7.10.

If $c_1/c_2 = 1/1$, reflected amplitude is 0 and transmitted amplitude is 1. We can use these equations
to determine the behavior of the rope described in section 7.8.2:


■ In the limit, if the second medium's speed of propagation is zero, $c_1/c_2 = 1/0$, then reflected
amplitude is $-1$ and transmitted amplitude is 0. This is the case for the rope with fixed end and the
Shive wave machine in figure 7.9 with its end clamped: all the energy is reflected and comes back
inverted.

■ If the second medium's speed of propagation is infinite, the reflected amplitude is 1 (not inverted)
and transmitted amplitude is 2. This corresponds to the rope with free end and the Shive wave
machine with its end unclamped.


Remember, total energy is conserved in the system, but amplitude adjusts according to the mass
and elasticity of each medium at the boundary.
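The worked values in this section can be checked numerically. The Python sketch below assumes the coefficient forms $A_r = (c_2 - c_1)/(c_1 + c_2)$ and $A_t = 2c_2/(c_1 + c_2)$, which are the forms that reproduce every example quoted above; the function names are hypothetical. Exact fractions are used so the results can be compared directly with 1/3, 2/3, and 4/3.

```python
from fractions import Fraction

def reflection_coefficient(c1, c2):
    """Amplitude of the reflected wave for unit incident amplitude,
    with the incident wave starting in medium 1 (speed c1)."""
    return (c2 - c1) / (c1 + c2)

def transmission_coefficient(c1, c2):
    """Amplitude of the transmitted wave for unit incident amplitude."""
    return 2 * c2 / (c1 + c2)

CASES = {
    "faster to slower (c1/c2 = 2/1)": (Fraction(2), Fraction(1)),
    "slower to faster (c1/c2 = 1/2)": (Fraction(1), Fraction(2)),
    "no change        (c1/c2 = 1/1)": (Fraction(1), Fraction(1)),
    "fixed end        (c2 = 0)":      (Fraction(1), Fraction(0)),
}

if __name__ == "__main__":
    for label, (c1, c2) in CASES.items():
        print(f"{label}: A_r = {reflection_coefficient(c1, c2)}, "
              f"A_t = {transmission_coefficient(c1, c2)}")
```

With these forms the transmitted amplitude always equals one plus the reflected amplitude, and the free-end case (reflected 1, transmitted 2) appears as the limit when $c_2$ grows without bound.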

Reflection of longitudinal waves, such as sound waves in air, can be visualized as follows. Consider
a long pipe with a drum head at the left end (figure 7.11a). We can create an impulsive wave by striking


Figure 7.11

Closed-ended and open-ended reflections in a tube. (Panels: a) closed-ended; b) open-ended.)














Figure 7.12 

Parabola focusing sound on a microphone. 

the drum head, which sends a sharp positive pressure wave down the tube. If the tube is closed at its
right end, the positive pressure doubles when the impulsive wave strikes it, and the wave is reflected
back as a positive pressure wave, the same as described for a free-end rope reflection.

However, if the tube is open at its right end, the wave interacts with the air surrounding the mouth
of the tube. When the positive pressure wave exits the tube, it displaces the air, which is at normal
atmospheric pressure around the mouth. This displacement is then propagated away from the
opening as a high-pressure wave. A new low-pressure area surrounding the mouth is created in its wake.
Air from outside and inside the tube is drawn to this new low-pressure zone. The air outside the
tube that is drawn back then propagates away from the tube as a low-pressure wave, and the air that
is drawn from inside the tube then propagates back up the tube as a low-pressure wave (figure 7.11b).
This is the same as described for a fixed-end rope reflection.

When sound reflects from a concave surface, wave intensity is focused exactly as light intensity
is focused in a reflecting telescope. If a microphone is placed at the focus of a parabolic dish, sound
waves arriving from the same direction as the directrix (the line bisecting the reflector) are focused
on it because each point on the parabola is equidistant from the focus and the directrix (figure 7.12).

7.8.5 Acoustical Coupling 

There are situations where it would be good to have as much energy as possible travel from one
medium into another while reducing or eliminating reflections due to the discontinuity at the
barrier between the two media. For instance, the boundary at the oval window in the ear has air outside
and denser perilymph inside (see section 6.2.3). If sound waves struck the oval window directly,
most energy would reflect back out of the ear and little would get inside to enable hearing. The
tympanum and the bones of the middle ear provide the inner ear with a mechanical coupling that passes
almost all energy into the inner ear for a range of frequencies of greatest biological interest, thereby
providing the inner ear with a more intense signal at these frequencies. Such coupling devices are
called transformers.

Figure 7.13 shows the Shive wave machine adapted to couple energy across a boundary with
minimal reflection. The average speed of sound in the central region is the geometric mean of the






Figure 7.13 

Transformer at a boundary.

speed in the two surrounding media. The transformer couples most efficiently for frequencies
whose wavelengths are close to four times the length of the transformer. It works progressively less
well for other frequencies.
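The two design rules for the transformer can be sketched numerically. The Python example below computes the geometric-mean speed for the central section and the frequency whose wavelength is four times the section's length. It assumes that the relevant wavelength is the one measured inside the transformer section, which the text does not state explicitly; the function names and sample values are hypothetical.

```python
import math

def transformer_speed(c1, c2):
    """Wave speed in the central section: geometric mean of the neighboring speeds."""
    if c1 <= 0 or c2 <= 0:
        raise ValueError("speeds must be positive")
    return math.sqrt(c1 * c2)

def best_coupled_frequency(transformer_length, c1, c2):
    """Frequency whose wavelength equals four times the transformer length.
    Assumption: the wavelength is taken inside the transformer section."""
    if transformer_length <= 0:
        raise ValueError("length must be positive")
    return transformer_speed(c1, c2) / (4.0 * transformer_length)

if __name__ == "__main__":
    c_slow, c_fast = 1.0, 4.0   # hypothetical speeds on either side, m/s
    length = 0.5                # hypothetical transformer length, m
    print("speed in transformer  :", transformer_speed(c_slow, c_fast), "m/s")
    print("best-coupled frequency:", best_coupled_frequency(length, c_slow, c_fast), "Hz")
```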

7.9 Refraction 

That part of a sound's energy that is transmitted through a boundary between two media enters into
the new medium and is subject to refraction. Suppose a plane wave strikes a plane surface with
a different speed of sound $c$. If the angle of incidence $\alpha$ of the wave is not perpendicular to the new
medium, the wave is bent upon entering it and goes off at a different angle $\beta$.

There are three cases to consider: 

■ The speed of sound is the same in the two media: $c_1 = c_2$. Then a sound that strikes the surface
of the second medium at an angle of incidence $\alpha$ will enter the new medium at angle $\beta$, and $\alpha = \beta$
(figure 7.14a).

■ The speed of sound is faster in the first medium: $c_1 > c_2$. The angle of incidence $\alpha > \beta$, resulting
in a focusing effect (figure 7.14b). The energy is focused because relatively wide angles of
incidence result in relatively narrower angles of refraction.

■ The speed of sound is faster in the second medium: $c_1 < c_2$. Then $\alpha < \beta$, resulting in a dispersive
effect (figure 7.14c). The energy is dispersed because relatively narrow angles of incidence result
in relatively wider angles of refraction.

In general, for some angle of incidence $\alpha$, the sound will be refracted into area Q if $c_1 < c_2$ and into
area P if $c_1 > c_2$ (figure 7.14d). We see that the angle of incidence $\alpha$ is related to angle of entry
$\beta$ by $\beta = K\alpha$, where $K$ is determined by the ratio $c_2/c_1$.

The exact relation, known as Snell's law, is

$\frac{\sin\alpha}{c_1} = \frac{\sin\beta}{c_2}$.    Refraction (7.21)





Solving for $\beta$, we have

$\beta = \sin^{-1}\left(\frac{c_2}{c_1}\sin\alpha\right)$.

For example, if $c_1 = 1.0$, $c_2 = 1.25$, and the angle of incidence is $\alpha = 45°$, the angle of entry is
$\beta = 62.1°$.

Where an incident sound wave moves from a slower to a faster medium ($c_1 < c_2$), there is a
critical angle $\alpha_{crit}$ that causes the angle of entry to equal 90°. When $\alpha = \alpha_{crit}$, the energy of the refracted
wave (called a creep wave) travels along the boundary, and is rapidly attenuated. When $\alpha > \alpha_{crit}$,
all incident wave energy is reflected. The value of the critical angle is

$\alpha_{crit} = \sin^{-1}\left(\frac{c_1}{c_2}\right)$.
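Snell's law and the critical angle are easy to check numerically. The following Python sketch solves $\sin\alpha / c_1 = \sin\beta / c_2$ for the angle of entry and takes the critical angle to be the incidence angle at which the angle of entry reaches 90°, that is, $\sin^{-1}(c_1/c_2)$. The function names are hypothetical, and returning `None` to signal total reflection is a design choice made for this sketch.

```python
import math

def refraction_angle_deg(alpha_deg, c1, c2):
    """Angle of entry (degrees) from Snell's law, sin(alpha)/c1 = sin(beta)/c2.
    Returns None when the incidence angle exceeds the critical angle,
    in which case all incident energy is reflected."""
    s = (c2 / c1) * math.sin(math.radians(alpha_deg))
    if s > 1.0:
        return None
    return math.degrees(math.asin(s))

def critical_angle_deg(c1, c2):
    """Incidence angle giving a 90-degree angle of entry. Only defined when the
    wave moves from a slower into a faster medium (c1 < c2); otherwise None."""
    if c1 >= c2:
        return None
    return math.degrees(math.asin(c1 / c2))

if __name__ == "__main__":
    c1, c2 = 1.0, 1.25  # the speeds used in the worked example above
    print("angle of entry at 45 degrees:", refraction_angle_deg(45.0, c1, c2))
    print("critical angle              :", critical_angle_deg(c1, c2))
    print("angle of entry at 60 degrees:", refraction_angle_deg(60.0, c1, c2))
```

The first line of output reproduces the 62.1° of the worked example when rounded to one decimal place.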

7.9.1 Gradient Refraction

Figure 7.14 shows refraction at sharply defined boundaries between media of different densities, 
but refraction takes place wherever any parameter affecting the speed of sound changes for any 
reason. Refraction can occur continuously over a gradient, for example. 


Figure 7.14

Refraction at a barrier. (Panel labels recovered from the figure: a) $c_1 = c_2$; b) $c_1 > c_2$.)





Figure 7.15

Refraction of sound in the atmosphere. (Panels: a) cooling with elevation; b) warming with
elevation. Label: surface of the earth.)

On a still evening with no clouds, the atmosphere near the ground remains warm because of the
ground's lingering heat, and the air generally is colder with increasing elevation. As the density
increases over a gradient, the speed of sound correspondingly decreases. Figure 7.15a shows a
sound source at some elevation above the surface of the earth. The warm air below causes the wave
front to speed up, the cool air above causes it to slow down, and the result is that the wave front
bends upward with increasing distance from the source.

On a still morning with no clouds, air in the upper atmosphere is warmed by the sun faster than
air near the ground, so the atmosphere generally gets warmer with increasing elevation. Figure 7.15b
shows the wave front bent downward with increasing distance from the source by the early morning
refraction gradient. Sound can often be heard on the surface of the earth at a farther distance when
refraction is downward under these weather conditions because the wave fronts travel around the tops
of surrounding obstacles. I once lived a half mile from the ocean. During the day, the surf sound was
blocked by tall cliffs. But on calm nights a temperature gradient would form, refracting the sound.
Every night (reliably within 5 minutes of midnight, for some reason), suddenly, I'd hear the surf.

7.9.2 Land Speed of Sound

Consider what happens when one shouts into the wind to a listener. The land speed of sound must
be measured with respect to the average wind speed. Thus, if c = 331 m/s but the air itself is moving
at 10 m/s, the land speed of sound $c_l$ is actually 341 m/s in the direction of the wind.

When wind flows smoothly without vortices or other turbulence, it travels slower at ground level
than higher up. The speed of air molecules in contact with the earth must be effectively zero
because the earth is stationary with respect to the wind and these air molecules are bound to earth






Figure 7.16 

Refraction due to wind speed.



Figure 7.17 

Sound absorption panel. 

by strong molecular forces. Air molecules above them are progressively less subject to this
resistance, with increasing elevation causing the air to move in horizontal sheets, an effect called
laminar flow. Thus the speed of wind, and hence the land speed of sound, shows a gradient increase
with elevation in the direction of the wind. Figure 7.16 shows this effect on two listeners at equal
distances downwind (a) and upwind (b) of a sound source. The listener upwind receives less
intensity than the downwind listener because upwind sound is refracted up into the sky.

7.10 Absorption 

When sound energy is transformed into another kind of energy, such as heat, we say the sound is
absorbed. Different materials absorb sound to varying degrees: a cement wall absorbs little and reflects
most sound energy; a wood wall absorbs some and reflects some; carpet absorbs most and reflects little.

Air itself absorbs sound energy, depending upon its temperature and relative humidity. Cold, dry
conditions favor sound transmission, whereas hot, moist conditions absorb sound, with high
frequencies being most subject to attenuation. This means that the intensity of a signal being
transmitted through air will actually be less than would be predicted by the inverse square law of
distance because the air itself dissipates energy from the sound.

An anechoic chamber is one whose walls absorb all sound, providing no detectable echo or
reverberation. Usually the walls are constructed of large wedges of soft, fibrous material on all
surfaces. If we envision a sound wave incident upon a wedge as a ray (figure 7.17), we can see that
reflection causes the ray to strike the wedge many times, each time transferring some of its energy
as heat into the wedge until it is completely absorbed.





Table 7.2

Absorption Coefficients of Various Materials at Various Frequencies

Material          Frequency (Hz)
                  125     250     500     1000    2000    4000
Concrete block    0.36    0.44    0.31    0.29    0.39    0.25
Wood              0.15    0.11    0.10    0.07    0.06    0.07
Carpet            0.08    0.24    0.57    0.69    0.71    0.73
Air               -       -       -       -       0.01    0.02


The best absorber of all is an open window: 100 percent of the energy that goes through it is lost
to listeners in the room. Thus the absorption of all other materials is compared to that of a window
of equal area, and the absorption coefficient of an open window is defined as a = 1. If a square
meter of carpet absorbs half as much sound as a window of equal size, then the absorption
coefficient of the carpet material would be a = 0.5.

A surface having an area S and an absorption coefficient a can be said to have total absorption
A = Sa, equal to an open window of area A. Materials vary as to how much sound they absorb in
different frequency bands. The absorption coefficients of concrete, wood, carpet, and air are shown
in table 7.2. The entry for air assumes a temperature of 20°C, 30 percent relative humidity.
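The relation A = Sa can be applied to a whole room by adding up the contribution of each surface. The Python sketch below does this using the 500 Hz coefficients from table 7.2. Summing over several surfaces is an assumption that goes one step beyond the single-surface statement in the text, and the function name and the room's surface areas are hypothetical.

```python
# Absorption coefficients at 500 Hz, taken from table 7.2.
ABSORPTION_500_HZ = {
    "concrete block": 0.31,
    "wood": 0.10,
    "carpet": 0.57,
}

def total_absorption(surfaces, coefficients):
    """Sum of S * a over all surfaces: the area of open window that would
    absorb the same amount of sound. `surfaces` is a list of
    (material, area in square meters) pairs."""
    return sum(area * coefficients[material] for material, area in surfaces)

if __name__ == "__main__":
    # Hypothetical room: concrete walls, a wooden door, a carpeted floor.
    room = [("concrete block", 50.0), ("wood", 10.0), ("carpet", 20.0)]
    a_total = total_absorption(room, ABSORPTION_500_HZ)
    print(f"equivalent open-window area at 500 Hz: {a_total:.1f} square meters")
```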

Our ears use attenuation due to air absorption and surface reflection to help identify objects in
space in a number of ways:

■ In the open, if we hear a sound we recognize, but it sounds muffled, we assume it is far away.
The muffling is a consequence of the attenuation of high frequencies with distance through air.

■ In a hall the direct signal from a source not only arrives first but is also spectrally the brightest that
we hear from that source because it is least subject to attenuation due to absorption at the walls
or in the air. Our ears compare the spectral brightness of the direct signal to the reflected signals as
a cue to the location of the sound source (along with other cues such as intensity and time of arrival).

7.11 Diffraction 

Since sound waves emanate spherically from a sound source, one might think that if an object
blocked the sound, the size of the sound shadow would grow with distance beyond the blocking
object (figure 7.18a). Instead, the sound shadow shrinks with distance (especially for low
frequencies) and even disappears (figure 7.18b). This is pretty good news if your seat at the concert hall
is directly behind a large pole: you'll still be able to hear the music, at least the low frequencies.
This tendency of sound to spread out into sound shadows is called diffraction.

Diffraction arises in two common situations: apparent bending of waves around small obstacles
or past sharp edges, and spreading out of waves past small apertures. Small in this case means small
in proportion to the wavelength of the passing waveform.






Figure 7.18 

Naive expectation vs. actual behavior of sound shadows. 



Figure 7.19 

Diffraction through an aperture. 


It is a lot easier to study diffraction of light than of sound because we can see light. So let's make
some simplifying assumptions and consider diffraction of light through an aperture (figure 7.19a).
Diffraction of sound works exactly the same way. Assume that the light arrives from such a distance
that the wave fronts are virtually parallel, so we can ignore the spherical complexities of waveforms.
Now imagine that this wave front impinges on a barrier with a small circular aperture. The light
passing through the aperture strikes a screen behind it. Figure 7.19b shows the diffraction pattern of light
intensity striking the screen for some aperture diameter $d$, wavelength $\lambda$, and distance to screen $z$. The
whiter the area, the more energy it is receiving. Figure 7.19c shows a cross-section of the diffraction
pattern with intensity on the y-axis. It is understandable that the light should be most intense on the
screen directly opposite the aperture. But what about the fringe areas that also get light energy?

To understand the diffraction pattern, consider how the plane wave front passes through the
aperture. According to Huygens's principle,⁶ the sound wave in the aperture behaves as if all the points
of the wave surface within the aperture were separate radiating sources of sound with the same
phase. This means that every point on the wave surface within the aperture emits vibrations not
only directly toward the screen but also in all other directions, hemispherically. Thus all the points
on the wave surface within the aperture radiate their energy in such a way that it spreads out uniformly






Figure 7.20 

Behavior of two points in the aperture according to Huygens. 


Figure 7.21

Constructive interference. (Label recovered from the figure: reinforcing.)

Figure 7.22

Destructive interference. (Label recovered from the figure: canceling at point B.)

and hemispherically into the area beyond the barrier. Figure 7.20 shows this behavior for two points
within the aperture. (We are assuming the phases of all these separately radiating points are aligned
by the plane wave that is driving them from behind.)

Figure 7.21 shows that vibrations emanating from all points on the wave surface reach the center point
A in phase and reinforce each other in constructive interference. (Only two points are shown to keep the
figure simple, but Huygens's principle holds for an infinite multitude of points interacting this way.)
Figure 7.22 shows how the same vibrations at a different angle cancel at point B, resulting in destructive








interference. Waves with the same wavelength will sometimes add, sometimes cancel, depending
upon $\theta$, the angle of incidence. The same destructive interference shown in figure 7.22 would happen
if point B were on the other side of the x-axis at the same distance from the center of the aperture.

If we consider the whole screen, there would be a dark ring with a radius B-A. Now imagine
a point C at twice the distance B is from A. The phase of the vibrations would once again be
constructive as they were at point A, and there would be a white ring on the diffraction pattern.
However, it would be fainter because the distance to the screen from the aperture is greater, and light
intensity drops with the square of the distance, according to the inverse square law. Thus consecutive
concentric rings receive less and less energy until they are insignificant.

The cross-section of the diffraction pattern (figure 7.19c) looks rather sinusoidal, although the
values are all positive and it dies away quickly at the edges. In fact, this is a squared sine wave that
is scaled so its intensity dies away quickly off-axis.

The approach to diffraction presented here, which considers only plane waves, is due to Joseph
Fraunhofer (1787-1826). Fraunhofer diffraction involves coherent plane waves incident upon an
obstruction. The more general case, Fresnel diffraction, is the same except that the curvature of the wave
fronts is taken into account. This was first worked out by Augustin-Jean Fresnel (1788-1827). The
diffraction pattern shown in figure 7.19b, historically called the Airy disc,7 corresponds to Fraunhofer
diffraction through a circular aperture. Diffraction through other aperture shapes requires different
equations.

Before we can construct an equation for diffraction, let's first isolate the factors involved:

■ Intensity diminishes as point p is moved further from the axis by increasing r (figure 7.19a). This
is simply the inverse square law at work.

■ Diffraction grows as the aperture becomes smaller. As the size of the aperture d shrinks, the total
amount of energy passed through decreases, and the energy that still gets past is diffracted more
strongly. Figure 7.23 shows a cross-section of the Airy disc diffraction pattern as a function of aperture
size. The larger the aperture, the more the energy tends to beam (and the more energy gets past



Figure 7.23 

Diffraction pattern as a function of aperture size. 






Figure 7.24

Diffraction pattern as a function of wavelength.


Figure 7.25

Diffraction pattern as a function of distance to screen.

the aperture because it is bigger); the smaller it is, the more the energy spreads out across the screen
uniformly (and the less energy gets past the aperture because it is smaller).

■ Diffraction is greater, the longer the wavelength. Low frequencies (large λ) tend to diffract, and
high frequencies (small λ) tend to beam. Figure 7.24 shows a cross-section of the Airy disc
diffraction pattern as a function of wavelength/frequency. As frequency goes from low to high, the
energy tends to beam.

■ As the distance to the screen z grows, the image on the screen gets larger and fainter (figure 7.25).

■ The diffraction pattern changes with the shape of the aperture. Figure 7.26 shows a diffraction
pattern made by a rectangular aperture.

Fraunhofer diffraction by an aperture is mathematically equivalent to the Fourier transform of the
aperture shape (see volume 2, chapter 3).









Figure 7.26 

Diffraction pattern of a rectangular aperture. 


We can engineer an equation for diffraction by putting all these facts together. Referring to the
geometry of figure 7.19a, we know that

■ Overall intensity and the amount of diffraction are both proportional to diameter d of aperture.

■ The diffraction effect is magnified as the distance z from the aperture to the screen grows, while
overall intensity simultaneously goes down.

■ The amount of diffraction is proportional to wavelength λ.

■ As we move away from the axis by a radius distance r, the intensity at point p goes down.

■ Overall, the intensity $I_a$ is directly proportional to the amount of sound energy entering the aperture.
Then the intensity $I_p$ of energy at point p on a screen that is distance r from the screen's axis is


■■OT-T 


Diffraction (7.22) 


where δ = dr/λz.

The terms d, λ, and z control the distance between the peaks of the diffraction pattern. The peaks
will become wider apart if the aperture d is made smaller, if the wavelength λ grows, or if the distance
to the screen z grows. Terms d and z also have an effect on the overall intensity: if the aperture d is
made larger, more energy is let through; if the distance to the screen z grows, the intensity diminishes.
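
The scaling described above can be checked numerically. The following is only a minimal sketch: it uses nothing but the definition δ = dr/λz, and it takes the radius at which δ reaches 1 as a rough measure of how wide the pattern is (that choice of reference value, the function names, the 343 m/s speed of sound, and the example dimensions are all assumptions made for illustration).

```python
SPEED_OF_SOUND = 343.0  # m/s; assumed value for air at room temperature


def delta(d, r, wavelength, z):
    """Off-axis parameter from the text: delta = d*r / (wavelength*z)."""
    return d * r / (wavelength * z)


def radius_for_delta(d, wavelength, z, target=1.0):
    """Radius r on the screen at which delta reaches `target`.

    The target of 1.0 is an illustrative reference value, not a claim
    about where any particular ring of the pattern falls.
    """
    return target * wavelength * z / d


if __name__ == "__main__":
    d, z = 0.2, 3.0  # hypothetical: 20 cm aperture, screen 3 m away
    for freq in (200.0, 2000.0, 20000.0):
        wavelength = SPEED_OF_SOUND / freq
        r = radius_for_delta(d, wavelength, z)
        print(f"{freq:7.0f} Hz: wavelength {wavelength:.4f} m, "
              f"delta = 1 at r = {r:.3f} m")
```

Under these assumptions the reference radius shrinks in proportion to wavelength, so high frequencies concentrate near the axis while low frequencies spread widely.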

Why does music played through a loudspeaker sound so different depending upon where one
is listening from? On-axis, in front of the speaker, we hear a rich mixture of low and high
frequencies; off-axis, the sound gets increasingly muffled until, standing directly behind the speaker,
all we hear is low frequencies. Of course, the answer is diffraction. A loudspeaker in an enclosure
is subject to exactly the same Fraunhofer diffraction as plane waves passing through an aperture
(figure 7.27). We would expect to hear these same diffraction effects, and we do. Figure 7.28 shows




Figure 7.28 

Equal-energy contours. 


equal-energy contour graphs for high-frequency energy (7.28a) and low-frequency energy (7.28b)
from a loudspeaker into a room.

7.12 Doppler Effect 

If a train moves past very quickly, we hear the pitch rise as it approaches and then fall as it moves
away. Why?

All waves travel at the same speed in a uniform medium, but if the distance between receiver
and sound source is shrinking, the more recently emitted waves do not have as far to go to reach
the receiver. In the time it takes to produce one wavelength, the source has moved toward the
receiver, thereby foreshortening the wavelength being emitted in that direction. We hear the
foreshortened wavelengths as a higher pitch. The same reasoning can be used to show why the pitch
drops for a sound source moving away.









7.12.1 Doppler Shift with Stationary Listener

Figure 7.29 shows how a sound source S moving to the right compresses the wavelengths emitted in
its direction of travel and lengthens those emitted in the opposite direction. Thus, a stationary listener
at point A hears a higher pitch than a listener at point B as a consequence of the movement of S.

Assuming the listener is stationary, the equation describing Doppler frequency shift $f_d$ for a
moving sound source at frequency f and velocity u is

$$f_d = f \cdot \frac{v_s}{v_s - u} \qquad \text{Doppler Shift} \quad (7.23)$$

where $v_s$ is the speed of sound. When the velocity of the source u = 0, the ratio $v_s/(v_s - u) = 1$
and $f_d = f$, so there is no pitch shift. But for u > 0 (corresponding to the source moving toward the
listener) the denominator is smaller while the numerator remains constant; therefore the ratio
$v_s/(v_s - u) > 1$, causing the Doppler-shifted frequency $f_d$ to go up. If the source is moving away,
u < 0, and the ratio becomes $v_s/(v_s + |u|) < 1$, causing the Doppler-shifted frequency $f_d$ to go
down. For example, if the source is moving toward us at half the speed of sound, then
$f_d = f\,[v_s/(v_s - 0.5v_s)] = 2f$, and we hear the frequency f shifted
up exactly one octave (figure 7.30).
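
As a quick numerical check of the moving-source relation $f_d = f\,v_s/(v_s - u)$, here is a minimal sketch that also expresses the shift as a musical interval in cents (see section 3.4). The function names and the 343 m/s default for the speed of sound are assumptions made for illustration.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value for air at room temperature


def doppler_moving_source(f, u, v_s=SPEED_OF_SOUND):
    """Frequency heard by a stationary listener.

    u > 0 means the source approaches the listener, u < 0 means it recedes.
    """
    if u >= v_s:
        raise ValueError("source speed must be below the speed of sound")
    return f * v_s / (v_s - u)


def interval_in_cents(f_shifted, f):
    """Size of the interval between two frequencies, in cents."""
    return 1200.0 * math.log2(f_shifted / f)


if __name__ == "__main__":
    f = 440.0
    for u in (0.5 * SPEED_OF_SOUND, -0.5 * SPEED_OF_SOUND):
        f_d = doppler_moving_source(f, u)
        print(f"u = {u:+7.1f} m/s: {f_d:7.2f} Hz, "
              f"{interval_in_cents(f_d, f):+8.2f} cents")
```

Approaching at half the speed of sound gives an octave up (+1200 cents), while receding at the same speed gives a ratio of 2/3, which is a just fifth down (about -702 cents), not an octave.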



Figure 7.29 

Doppler shift. 



Figure 7.30 

Frequency shift heard at half the speed of sound.





We might expect that if the sound source were to move away at half the speed of sound, we would
hear a one-octave shift down in frequency, but since $f_d = f\,[v_s/(v_s - (-0.5v_s))] = \tfrac{2}{3}f$, the pitch
drops only by a fifth.

If the source is traveling away from the listener at the speed of sound, then

$$f_d = f \cdot \frac{v_s}{v_s + v_s} = \frac{f}{2},$$

a drop of an octave (figure 7.31). If the source is traveling toward the listener at the speed of sound,
then

$$f_d = f \cdot \frac{v_s}{v_s - v_s} = \infty,$$

so waves emitted exactly in the direction of travel at the speed of sound stack up on top of each
other and form a single pulse with infinite frequency (see figure 7.31). A listener standing nearby
would first hear a sonic boom, then the sound source shifted down an octave as it flashed past.8

What happens if velocity exceeds the speed of sound, that is, $u > v_s$? Theoretically, the listener
would hear the sound backward, and only after it had passed by (figure 7.32).


Figure 7.31

Frequency shift heard at the speed of sound.



Figure 7.32 

Supersonic Doppler shift. 





7.12.2 Doppler Shift with Stationary Sound Source 

Another possibility for Doppler shift is when the sound source is stationary and the receiver is
approaching it. The equation for this case is

$$f_d = f \cdot \frac{v_s + u_r}{v_s} \qquad \text{Doppler Shift, Receiver Moves} \quad (7.24)$$

where $u_r$ is the velocity of the receiver toward the source. If the receiver moves away from the
source at the speed of sound, then $f_d = f\,[(v_s - v_s)/v_s] = 0$
because the receiver is traveling at the same speed as the sound; all frequencies are shifted to 0.
If the receiver approaches the source at the speed of sound, then $f_d = f\,[(v_s + v_s)/v_s] = 2f$, for a
shift up by an octave.

7.12.3 Doppler Shift with Source and Receiver Moving

If the source and receiver are moving with velocities $u_s$ and $u_r$, respectively, the equation becomes

$$f_d = f \cdot \frac{v_s + u_r}{v_s - u_s} \qquad \text{Doppler Shift, Both Move} \quad (7.25)$$
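
The one-dimensional cases can be gathered into a single function. This is a sketch under stated assumptions: it uses the standard combined relation $f_d = f\,(v_s + u_r)/(v_s - u_s)$, with both velocities counted positive when source and receiver move toward each other; the function name and the 343 m/s default are illustrative choices.

```python
SPEED_OF_SOUND = 343.0  # m/s; assumed value for air at room temperature


def doppler_1d(f, u_source=0.0, u_receiver=0.0, v_s=SPEED_OF_SOUND):
    """One-dimensional Doppler shift with source and/or receiver moving.

    Velocities are positive when the mover heads toward the other party
    and negative when it heads away.
    """
    if u_source >= v_s:
        raise ValueError("source speed must be below the speed of sound")
    return f * (v_s + u_receiver) / (v_s - u_source)


if __name__ == "__main__":
    f = 440.0
    print(doppler_1d(f, u_receiver=-SPEED_OF_SOUND))  # receiver flees at v_s
    print(doppler_1d(f, u_receiver=SPEED_OF_SOUND))   # receiver approaches at v_s
    print(doppler_1d(f, u_source=-SPEED_OF_SOUND))    # source recedes at v_s
```

Note the asymmetry: a receiver approaching at the speed of sound raises the pitch by only an octave, whereas a source approaching at the same speed drives the denominator to zero.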

7.12.4 Two-Dimensional Doppler Shift 

Equation (7.25) and all the other Doppler equations in the preceding sections are only accurate in
one dimension, that is, where the source and listener are headed either directly away from or toward
each other. However, most listeners prefer to stand to one side of speeding trains when observing
their Doppler effect. A listener close to the train tracks experiences a sharper swing in pitch as the
train passes by than a listener some distance away. What accounts for the difference?

I'll only consider the case where the source travels in a straight line past a listener in two
dimensions, as when a train goes past someone standing beside the tracks. However, since paths can usually
be broken down into a sequence of linear segments, this approach can be generalized with little effort.

In this case, we can no longer rely on absolute velocity; we must look at the relative velocity between
source and receiver. How the distance between source and receiver changes through time determines
Doppler shift. Figure 7.33 shows a sound source moving on a straight line at some absolute velocity u





Figure 7.33 

Trajectory of a moving sound source. 





past a stationary listener at distance d. Clearly, positive Doppler shift is greatest at position 1, less at 2.
There is no Doppler shift at position 3 because there is no relative velocity between source and
receiver; for a moment they are moving neither closer together nor further apart. After position 3, Doppler
shift starts going negative, and the negative Doppler shift at position 4 matches the positive Doppler
shift at 1. Doppler shift will be maximum at both horizons, and zero at the point of closest approach.

Geometrically, Doppler shift is proportional to the arctangent of the ratio of the lengths of sides
x and d, as shown in figure 7.33. The relative velocity $\hat{u}$ from the source to listener in terms of the
source velocity u and the distance d of the listener from the source's path can be expressed as

$$\hat{u} = u \cdot \frac{2\,\mathrm{atan}(-x/d)}{\pi},$$

where x is the location of the source along its trajectory (with respect to the point of nearest
approach). Setting $u = \hat{u}$ in the Doppler shift equation for a moving source given in (7.23), we have

$$f_d = f \cdot \frac{v_s}{v_s - u \cdot \dfrac{2\,\mathrm{atan}(-x/d)}{\pi}} \qquad \text{Doppler Shift in Two Dimensions} \quad (7.26)$$

We can test (7.26) by setting the distance d to 0, so the source heads directly at the listener,
and we should get the same Doppler shift results as before. And, indeed, when d = 0, the factor
$2\,\mathrm{atan}(-x/d)/\pi$ has magnitude 1 no matter what x is, so that term drops out of (7.26), and we have just
(7.23) again. Figure 7.34 plots the Doppler shift that the listener will hear if a sound source
producing a pure tone at 440 Hz goes past at half the speed of sound. The curves show the listener at
1 m, 5 m, 50 m, and 100 m from the closest approach, and each curve is plotted for the span of 100 m
on either side of the listener's position. The closer the listener gets to the path, the closer the
Doppler shift approaches the one-dimensional case; for instance, at 1 m the frequency is nearly
doubled on approach, then drops nearly a fifth (to 293 Hz) when departing.
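
The curves just described can be reproduced with a few lines of code. The sketch below implements the text's relation, with relative velocity u · 2 atan(-x/d)/π substituted into the moving-source Doppler equation. The function name, the 343 m/s default, and the use of `math.atan2` (so that d = 0 is handled without dividing by zero) are assumptions made for illustration.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value for air at room temperature


def doppler_2d(f, u, x, d, v_s=SPEED_OF_SOUND):
    """Doppler-shifted frequency for a source passing a listener.

    f: emitted frequency (Hz); u: source speed along its path (m/s);
    x: source position along the path relative to closest approach
       (negative while approaching); d: listener distance from path (>= 0).
    """
    # atan2(-x, d) equals atan(-x/d) for d > 0 and stays defined at d == 0.
    u_rel = u * 2.0 * math.atan2(-x, d) / math.pi
    return f * v_s / (v_s - u_rel)


if __name__ == "__main__":
    f, u = 440.0, 0.5 * SPEED_OF_SOUND
    for d in (1.0, 5.0, 50.0, 100.0):
        approach = doppler_2d(f, u, -100.0, d)
        depart = doppler_2d(f, u, 100.0, d)
        print(f"d = {d:5.1f} m: {approach:6.1f} Hz approaching, "
              f"{depart:6.1f} Hz departing")
```

At the point of closest approach (x = 0) the function returns the unshifted frequency, and as d shrinks the end-point values approach the one-dimensional results of 2f and (2/3)f.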



Figure 7.34 

Doppler shift curves. 





7.13 Room Acoustics 

Suppose we pop a balloon in a concert hall and record the result. The sound the room makes in
response to this brief impulse of sound is its impulse response.

The first few sound paths from source S to receiver R are shown in figure 7.35. The direct signal D
travels the line-of-sight path from the sound source to the microphone, arriving at time $t_D$. The next
few impulses, early reflections, arrive at the microphone after reflecting from nearby surfaces.
These include the first-order (one-bounce) reflections labeled 1, 2, 3, and 4, and the second-order
(two-bounce) reflections labeled 5 and 6. Many other possible paths from source to receiver are
not shown. In addition to these paths, there are also reflections from the side walls, from the stairs,
and from the stage. Over time, there are so many reflection paths that the sound field in the room
ends up composed of plane waves distributed with uniform randomness in all directions.

Figure 7.36 shows an idealized impulse response of a hypothetical room. The original impulse
occurs at time $t_0$. The direct signal arrives at the microphone at time $t_D$. Depending upon the
geometry of the room, the early reflections may occupy the first 10 to 100 ms of the room's reverberation



Figure 7.35 

Direct sound path and early reflections in a concert hall. 





Figure 7.36 

Idealized impulse response of a hypothetical room. 









Figure 7.37 

Impulse response of a cathedral (0.5 seconds). 

time. The time delay of each reflection is proportional to the time it takes the impulse to travel from
the sound source to the walls and then to the microphone. The amplitude of each reflection is
inversely proportional to the distance traveled, directly proportional to the size of the reflecting
surface, and inversely related to how absorbent the surface material is (among other factors).
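
To make the timing and level of an early reflection concrete, here is a rough sketch. It assumes a speed of sound of 343 m/s, models amplitude as falling off as 1/distance, and lumps the surface's size and material into a single hypothetical `reflectance` factor between 0 and 1; none of these names or values come from the text.

```python
SPEED_OF_SOUND = 343.0  # m/s; assumed value for air at room temperature


def reflection_arrival(path_length, direct_length, reflectance=1.0,
                       v_s=SPEED_OF_SOUND):
    """Delay and level of one reflection relative to the direct signal.

    path_length: total source-to-surface-to-microphone distance (m)
    direct_length: line-of-sight source-to-microphone distance (m)
    reflectance: hypothetical fraction of amplitude the surface returns
    Returns (delay after the direct signal in seconds, relative amplitude).
    """
    delay = (path_length - direct_length) / v_s
    relative_amplitude = reflectance * direct_length / path_length
    return delay, relative_amplitude


if __name__ == "__main__":
    delay, amp = reflection_arrival(20.0, 10.0, reflectance=0.8)
    print(f"arrives {delay * 1000:.1f} ms after the direct signal "
          f"at {amp:.2f} of its amplitude")
```

A reflection whose path is 10 m longer than the direct path arrives about 29 ms late, which falls within the 10 to 100 ms range given for early reflections.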

The remaining reflections, late echoes or the reverberation tail, are the result of the combinatorial
explosion of multiple reflections over time. The sound energy in a good-sounding hall
declines approximately in an exponential curve after the source has stopped emitting sound. The
shape of the curve is influenced by the position, orientation, and characteristics of the sound source
and listener as well as their placement in the room. The notion that the room response can be
idealized into distinct sections does not necessarily bear out well in practice.

In addition to reflection, sound distribution is also influenced by spreading, absorption, refraction,
and diffraction. Figure 7.37 shows the first half-second of the impulse response of a cathedral.
While some early reflections stand out, it is remarkable how quickly and uniformly the density of
reflections builds up. The room response shows a uniform, gradual buildup until about 180 ms,
then a gradual decline, providing a rich reverberant background. The long reverberation tail (not
shown) is audible for about 10 s after the impulse.

7.13.1 Musical Character of Rooms

Music of a particular style is generally designed for a characteristic listening environment and may
not sound good if reproduced in an uncharacteristic setting. For example, plainchant (a monophonic
vocal style of the Middle Ages in Europe that is made up mostly of long sustained tones and
slow tempos) was designed for highly reverberant cathedral spaces where sound lingers for 10 s
or longer. Highly intricate and rhythmically active polyphonic music of the Baroque era was
designed for halls with reverberation times of about 2-3 s. Plainchant in a Baroque concert hall
sounds thin and exposed, and when polyphonic Baroque music is performed in cathedrals, the
sound lingering from previous notes interferes with subsequent notes, making it difficult to follow
the intricate lines of the music. Architectural taste and musical taste go through cycles of fashion
and convenience. A late-romantic symphony by Gustav Mahler, with its focus on sonority, calls
for longer reverberation time; a string quartet from any era sounds best in a more modest room;
organ music usually calls for cathedral-length reverberation.




Figure 7.38 

Impulse response of a bad-sounding room.

For speech, the primary objective is intelligibility rather than a graceful reverberation. Figure 7.38
shows the first half-second of the impulse response of a concrete tunnel with severe acoustical
problems. Its evenly spaced early reflections at 10 ms intervals produce a flutter echo caused by the
parallel walls of the tunnel, which gives an unpleasant shuddering quality to the room response. At
100 ms, a slap-back echo from another section of the tunnel reaches the microphone. Since this is
outside of the interval masked by the precedence effect (section 6.13.3), it is heard as a separate
acoustical event, competing with the direct signal for audition and severely degrading intelligibility.

7.13.2 Reverberation Time 

In the 1890s, Wallace Sabine (1868-1919) was presented by Harvard University with the challenge
of taming the bad acoustics of a lecture hall. The room was unusable because its reverberation time
was excessive, and it also had other problems similar to those shown in figure 7.38. His elegant
solution created the foundations of the field of architectural acoustics. Sabine (1921) described the
problem with his characteristic lucid prose:

In the lecture room of Harvard University, ... a word spoken in an ordinary tone of voice was audible for five
and a half seconds afterwards. During this time even a very deliberate speaker would have uttered 12 or 15
succeeding syllables. Thus, the successive enunciations blended into a loud sound, through which and above
which it was necessary to hear and distinguish the orderly progression of the speech.

Sabine reasoned that two principal factors competed to determine the reverberation time of a
room: its boundary surface area and its internal volume.

■ Surface area. When sound is reflected from a surface, a great deal of its energy is lost. Some is
converted into heat in the wall, and some is transmitted through the wall to the outside. In either case, the
sound is absorbed because it is removed from the room (see section 7.10). Increasing the boundary
surface area reduces the reverberation time because then the sound has more opportunity to be absorbed.

■ Volume. The larger the volume of air in a hall, the less opportunity sound has to reflect off the
walls and be absorbed. In comparison to the walls, air itself absorbs relatively little energy when
transmitting sound (see table 7.2). Increasing the internal volume decreases the rate of sound
dissipation and therefore increases reverberation time.





From these considerations, we can see that reverberation time $T_R$ is proportional to the ratio of
volume to area:

$$T_R \propto \frac{\text{Internal volume}}{\text{Boundary surface area}}.$$

Suppose we have a bare room with internal volume V, with hard walls that absorb very little sound
energy, and one wall contains an open window of area S. All sound that passes through the window
to the outside can be said to be absorbed by the window in the sense that it leaves and does not return.
The reverberation time $T_R$ is equal to the ratio of the volume V to the area S of the open window,

$$T_R = k \cdot \frac{V}{S}, \qquad (7.27)$$

related by a constant k. When V is in m³ and S is in m², Sabine found that the constant k is 0.161 s/m
(see appendix A).
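
The open-window relation $T_R = kV/S$ with k = 0.161 s/m is easy to try out. The following is a minimal sketch; the function name and the example room dimensions are hypothetical.

```python
SABINE_K = 0.161  # s/m, Sabine's constant for V in cubic meters, S in square meters


def reverb_time_open_window(volume, window_area, k=SABINE_K):
    """Reverberation time (s) of a hard-walled room with one open window."""
    if window_area <= 0:
        raise ValueError("window area must be positive")
    return k * volume / window_area


if __name__ == "__main__":
    volume = 1000.0  # hypothetical room of 1000 cubic meters
    for area in (5.0, 10.0, 20.0):
        t = reverb_time_open_window(volume, area)
        print(f"window of {area:4.1f} m^2: T_R = {t:5.2f} s")
```

Doubling the window area halves the reverberation time, and doubling the volume doubles it, which is the proportionality stated in the text.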

Absorption. Not all surfaces absorb sound at the same rate. Rooms lined with carpeting and
curtains absorb sound much more readily than do bare walls of brick or concrete.

Sabine modeled the absorption of particular surface materials by comparing them to the ideal
absorption of an open window of the same area. Since the open window absorbs all energy that
reaches it, he assigned it an absorption coefficient of a = 1. A surface material that absorbs half of
the incident sound energy has an absorption coefficient of a = 0.5. Two square meters of this
material would be needed to replace the absorption provided by a window of 1 square meter. From this
example, we see that in general, a surface of area S and absorption coefficient a has an absorption

$$A = aS \qquad (7.28)$$

that is equivalent to the absorption of an open window of area A.

Real rooms have a variety of surfaces with different materials. The average absorption $\bar{a}$ of all
surfaces is simply the sum of contributions from each surface that reflects sound in the room,
divided by the total surface area:

$$\bar{a} = \frac{1}{S}\,(a_1 S_1 + a_2 S_2 + \cdots + a_N S_N) = \frac{1}{S}\sum_{i=1}^{N} a_i S_i, \qquad (7.29)$$

where the $S_i$ are the individual surface areas, the $a_i$ are the corresponding absorption coefficients,
N is the number of surfaces, and S is the total boundary surface area of the room. Absorption is
sometimes expressed in units of metric sabine, the absorption of 1 square meter of open window,
named in honor of Wallace Sabine.

Combining volume, surface area, and absorption, Sabine's formula for reverberation time is

$$T_R = 0.161\,\frac{V}{S\bar{a}} = 0.161\,\frac{V}{\sum_{i=1}^{N} a_i S_i}. \qquad (7.30)$$





Air Absorption. Though the effect is small in comparison to the absorption of surfaces, in a large
enough hall the absorption of sound by the air itself must be taken into account. The absorption
of air m depends upon temperature and relative humidity and is equal to about 0.012 at 20°C,
30 percent relative humidity. The absorption of air also depends upon the volume V of air the sound
must travel through. So we must add a term mV to the denominator of (7.30):

$$T_R = 0.161\,\frac{V}{mV + \sum_{i=1}^{N} a_i S_i}. \qquad \text{Sabine's Equation for Reverberation Time} \quad (7.31)$$
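
Sabine's equation with the air term, $T_R = 0.161\,V/(mV + \sum a_i S_i)$, can be sketched as a short function. The surface areas and absorption coefficients in the example are invented for illustration and are not measured values; the function name is likewise an assumption.

```python
SABINE_K = 0.161  # s/m, for volume in cubic meters and area in square meters


def sabine_reverb_time(volume, surfaces, air_absorption=0.0, k=SABINE_K):
    """Reverberation time (s) from Sabine's equation.

    volume: internal volume V (m^3)
    surfaces: iterable of (area in m^2, absorption coefficient 0..1) pairs
    air_absorption: the air term m; 0.0 ignores air absorption
    """
    total_absorption = sum(area * coeff for area, coeff in surfaces)
    denominator = air_absorption * volume + total_absorption
    if denominator <= 0:
        raise ValueError("the room must have some absorption")
    return k * volume / denominator


if __name__ == "__main__":
    # Hypothetical 10 m x 8 m x 4 m room; coefficients are illustrative only.
    volume = 10.0 * 8.0 * 4.0
    surfaces = [
        (80.0, 0.30),   # carpeted floor
        (80.0, 0.10),   # ceiling
        (144.0, 0.05),  # four walls
    ]
    print(f"surfaces only: {sabine_reverb_time(volume, surfaces):.2f} s")
    print(f"with air term: {sabine_reverb_time(volume, surfaces, 0.012):.2f} s")
```

In a room this small the air term shortens the result only slightly, consistent with the remark that air absorption matters mainly in large halls.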

Frequency Response. Most surface materials and the air itself tend to absorb high frequencies
more readily than low frequencies. Each time the sound strikes a wall, and the farther the sound
travels in air, some high-frequency energy is removed from the reflections, providing a low-pass
filter effect. The late reflections progressively darken the tone of the reverberation because each
reflection absorbs a little more of the remaining high-frequency energy. Our ears use this cue to
help us distinguish the sound source from the decaying reverberant sound field and to distinguish
newly arriving sound from lingering sound.

Low-frequency sound tends not to be reflected by the walls but passes through them to the
outside. Each time the sound strikes a wall, some low-frequency energy is transmitted out of the hall
and lost, providing a high-pass filter effect. If too many bass frequencies are lost too quickly, the
hall reverberation sounds tinny. Wall material must be quite dense, such as stone or brick, to retard
the escape of the lowest frequencies.

7.13.3 Sound Quality of Halls

The combination of low-pass filtering of high frequencies and high-pass filtering of low
frequencies means that the reverberation tail contains mostly low-to-mid-range frequencies. The rate at
which frequencies in different ranges decay affects the quality of the hall's reverberation. If
low-frequency energy lingers too long, the hall sounds "tubby." If high-frequency energy lingers
too long, the hall lacks warmth.

Many factors contribute to a good-sounding hall. Studies by Manfred Schroeder (1979) have
shown the importance of having sufficient sound reflected to the listeners from the side. Lateral
reflections with relative time delays in the range of 25-80 ms add a feeling of pleasant spaciousness.

Beranek (1962) listed 18 subjective attributes that affect the quality of a concert hall. A few of
these are intimacy, liveness, warmth, loud-enough direct sound, evenness of reverberant sound
throughout the hall, good clarity or definition (the direct signal and early reflections should not be
lost in the reverberation), ensemble (players should be able to hear one another easily), and
sufficient quiet. To this list can be added no strong echoes, no flutter echoes, no focusing of sound
by large concave surfaces, and no sound shadows underneath balconies.

In spite of a century of theory and experimentation, architectural acoustics is far from a science.
Major successes and catastrophic failures have been designed by architectural acousticians.
Sabine's acoustic design for the Boston Symphony Hall made it one of the premier concert halls




in the world. But it was admittedly a combination of good science and good luck. Beranek's acoustic
design for Avery Fisher Hall, originally called Philharmonic Hall, in New York was an enormous
and costly failure in spite of his extensive research, which included direct measurements of the
world's finest concert halls. (The failure was perhaps more a consequence of the fact that the
architect failed to take his recommendations fully into account.) Whereas laboratory scientists generally
can suffer their failures in the privacy of their laboratories, not so acousticians, who succeed or fail
very publicly.9

7.14 Summary 

Air is adiabatic, and as a consequence, sound travels through air in waves determined mostly by
the mass and elastic properties of air molecules. We derived the speed of sound from underlying
physical principles and considered types of waves and how sound radiates.

Waves interact in a medium additively through constructive and destructive interference. Sound
may be scattered or reflected depending upon the relation between the scale of the object the sound
encounters and the frequency content of the sound. Reflections complicate the job our ears have
to determine sound location. We can use reflection to determine distance and other properties in
an acoustical environment without having to do direct measurements. Reflection occurs at
boundaries between media, with or without phase reversal, depending upon whether the sound enters a
denser medium. Sound can be transmitted more efficiently by matching the impedance of the two
media using transformers.

Sound is also subject to refraction at the boundary between media, depending upon the angle
of incidence. If the boundary is continuous, the sound undergoes gradient refraction. When sound
energy is transformed into another kind of energy, such as heat, we say the sound is absorbed. The
tendency of sound to spread out into sound shadows is called diffraction. Doppler effect is
the apparent change in frequency of a sound as a source and a receiver pass each other at relative
velocities.

Finally, we examined the acoustics of halls. Wallace Sabine showed that the reverberation time
of a room depends upon the ratio of its internal volume to its boundary surface area.

7.15 Suggested Reading 

Backus, John. 1977. The Acoustical Foundations of Music. 2d ed. New York: W. W. Norton.

Benade, Arthur H. 1976. Fundamentals of Musical Acoustics. New York: Oxford University Press.

Beranek, Leo. 1986. Acoustics. Rev. ed. Melville, N.Y.: American Institute of Physics.

Hall, Donald E. 1980. Musical Acoustics: An Introduction. Belmont, Calif.: Wadsworth.

Kinsler, Lawrence E., Austin R. Frey, Alan B. Coppens, and James V. Sanders. 1982. Fundamentals of Acoustics. 2d ed.
New York: Wiley.

Rossing, Thomas D. 1983. The Science of Sound. Reading, Mass.: Addison-Wesley.



Vibrating Systems 


Philosophy is written in this grand book (I mean the universe) which stands continuously open to our gaze,
but which cannot be understood unless one first learns to comprehend the language and to interpret the
characters in which it is written. It is written in the language of mathematics, and its characters are triangles,
circles and other geometric figures, without which it is humanly impossible to understand a single word of
it; without these, one wanders about as in a dark labyrinth.

Galileo Galilei, The Assayer

8.1 Simple Harmonic Motion Revisited

The basis for the production of music and sound lies in the principles of mechanical physics. The
physical laws of vibration are highly applicable to music, because they determine not only the sounds
instruments make but also how the basilar membrane vibrates in response (see section 6.2.4).

In section 1.2.2 I broached the subject of one-dimensional harmonic motion of a spring and
weight system. In chapter 5 I related it to circular motion. Now it's time for a still deeper view.
Vibration arises from the interaction of an elastic force and inertia. We saw, for example, that these
determine the speed of sound (see section 7.4).

8.1.1 Elasticity

Elasticity is the property of a material that allows it to restore itself to its original shape after being
distorted (stretched, compressed, twisted). An elastic material is in equilibrium when all forces
applied to it sum to zero. If the sum of applied forces is zero and does not change through time,
it is in static equilibrium.

Suppose I fix one end of a helical spring to a stationary object such as a ceiling, with its other
end dangling freely. As I displace the free end away from or toward the fixed end, I distort the
spring's shape. The internal elastic force that pushes or pulls against my hand, seeking to return
it to its original shape, is its restoring force.

8.1.2 Elasticity, Stiffness, and Hooke's Law

Suppose I apply a force to two elastic materials until they are displaced the same amount. If the
force needed to achieve the same displacement of the two materials is different in each case, then





one material is stiffer than the other. The stiffness of an elastic material is the ratio of applied force
to the resulting displacement. We can equate the force F required to achieve a displacement x by
inventing a mediating constant of proportionality k, which represents the strength of the
counterforce applied by the elastic material:

$$\frac{F}{x} = -k.$$

The negative sign of k reminds us that the direction of the counterforce seeks to restore the spring
to its undeformed state. Solving for F yields

$$F = -kx. \qquad \text{Hooke's Law} \quad (8.1)$$

This equation1 relates stiffness, force, and displacement. Stiffness k is sometimes called the spring
constant. The reciprocal of stiffness is called compliance.

Hooke's law is not a fundamental physical law like Newton's laws of motion; it is simply an
observation about a common physical phenomenon related to the properties of elastic materials.
In particular, the constant k is not a fundamental constant of nature but a value determined
experimentally for each material, based on the molecular structure of the material.

8.1.3 Linear and Nonlinear Elasticity

Hooke's law describes a linear relation between force, displacement, and spring constant because,
when plotted, the relation between force and displacement is a straight line with slope F/x = -k
(figure 8.1a).

Simple harmonic motion is the term used to describe the vibration of instruments that are
governed by linear elasticity because their partials are (for the most part) in harmonic relation, that is,
they are integer multiples of the fundamental. These instruments include violins, woodwinds,
brass, and tuned percussion instruments.




Figure 8.1 

Linear and nonlinear elasticity. 





A great advantage and a great limitation of Hooke's law is that it does not take into account the extent of a material's elasticity. No material is elastic over an infinite range. Although it may respond in a relatively uniform way within a central range, beyond some point it requires much greater force to be deformed further, and eventually many materials bend permanently or break if forced too far. Beyond this central range, we can't speak of a simple spring constant because the force that must be applied to achieve greater displacement does not increase in a straight line: this is a nonlinear relation. Instead, we must construct a curve describing the material's stiffness as a function of displacement: F/x = -K(x) (figure 8.1b). In this case, the amount of restoring force is a nonlinear function of the amount of displacement. All physical materials are to some degree nonlinearly elastic.
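To make the contrast between a constant stiffness and a displacement-dependent stiffness concrete, here is a minimal Python sketch. The hardening law K(x) = k(1 + βx²) is purely an assumption chosen for illustration (the text does not specify any particular K(x)), and the names `linear_force`, `hardening_force`, and `beta` are hypothetical.

```python
def linear_force(x, k):
    """Restoring force of an ideal Hooke's-law spring: F = -k x."""
    return -k * x


def hardening_force(x, k, beta):
    """Restoring force F = -K(x) x for an ASSUMED stiffness curve
    K(x) = k (1 + beta x^2). This form is illustrative only."""
    return -k * (1.0 + beta * x * x) * x


k = 392.0     # N/m, illustrative spring constant
beta = 400.0  # 1/m^2, illustrative (hypothetical) nonlinearity coefficient

for x in (0.005, 0.025, 0.1):
    ratio_linear = linear_force(x, k) / x
    ratio_nonlinear = hardening_force(x, k, beta) / x
    # For the linear spring F/x is the same at every displacement;
    # for the hardening spring F/x grows in magnitude as x grows.
    print(f"x = {x:5.3f} m   linear F/x = {ratio_linear:8.1f}   "
          f"nonlinear F/x = {ratio_nonlinear:8.1f}")
```

Within a small central range the two ratios nearly agree, which is why a single spring constant is a useful approximation there.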

The advantage of Hooke's law is that it sheds a great deal of light on the nature of harmonic vibrating systems, which account for a great deal of our acoustical environment. Harmonic systems are also typically easier to understand, mathematically. But it is important to remember that if we study only linear systems, we overlook some of the signature characteristics of musical instruments that result from their nonlinear elasticity, and we won't be able to make sense of highly nonlinear vibrating systems at all. That being said, let's take the easier path and study linear systems first.

8.2 Frequency of Vibrating Systems

Notice that Hooke's law in (8.1) does not include mass but deals only with the elastic properties of objects. Suppose I have a tethered lightweight spring with spring constant k, and I suspend a mass m from its free end. I let the spring stretch to its point of static equilibrium. After it comes to rest, I then displace the spring a distance r by pulling down on it. (I pull it only a small distance so that it remains in its relatively linear elastic range.) Moving it by distance r required me to supply a force F = kr to overcome the spring's stiffness, and the spring now exerts a restoring force of -kr.

If I release the mass, it will begin to rise, seeking the spring's point of equilibrium. By Newton's laws of motion, acceleration of the mass is proportional to F/m, so the acceleration will be

$$a = \frac{kr}{m}. \qquad (8.2)$$

We now have two equations for acceleration: (8.2) for linear acceleration of a mass on a spring and (5.12) for centripetal acceleration. These can be viewed as equivalent motions (see section 5.1). Therefore we can also equate their accelerations. Doing so, we have

$$\frac{v^2}{r} = \frac{kr}{m}.$$

Note that we have introduced velocity v into the equation. Solving for v yields

$$v = \sqrt{\frac{kr^2}{m}} = r\sqrt{\frac{k}{m}}. \qquad (8.3)$$



C hapter 8 


Now, we have two equations for the velocity of vibrating systems: (8.3) and (5.9). Equating them, we have

$$\frac{2\pi r}{T} = r\sqrt{\frac{k}{m}}.$$

Note that we have introduced periodic time T into the equation. Solving for T yields

$$T = 2\pi\sqrt{\frac{m}{k}}.$$

Recalling that frequency f = 1/T, we can write

$$f = \frac{1}{2\pi}\sqrt{\frac{k}{m}}. \qquad \text{Vibrating Frequency (8.4)}$$


Equation (8.4) relates the frequency of a vibrating spring/mass system to its linear spring constant k and its mass m. The equation predicts that the frequency of a vibrating system will double if the spring constant quadruples, and will halve if the mass quadruples.

For a practical example, consider the spring and weight system shown in figures 1.4 and 8.7. To determine its frequency of vibration, we must determine its spring constant and the amount of mass. We can determine the spring constant by measuring the degree of stretch induced by gravity. Suppose the spring stretches by 0.025 m when loaded with 1 kg. At this point, the elastic force balances the force of gravity, which means kl = mg. Solving for k and substituting m = 1 and l = 0.025, we have

$$k = \frac{mg}{l} = \frac{1 \cdot 9.8}{0.025} = 392 \text{ N/m}.$$

With a mass of 1 kg the vibrational frequency would be

$$f = \frac{1}{2\pi}\sqrt{\frac{k}{m}} = \frac{1}{2 \cdot 3.14}\sqrt{\frac{392}{1}} \approx 3.15 \text{ Hz}.$$

Doubling the mass to 2 kg drops the frequency to about 2.23 Hz.
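The arithmetic of this example can be checked with a short Python sketch that evaluates kl = mg and equation (8.4) directly. The function names are hypothetical, and g = 9.8 m/s² is the value used in the text. Evaluated this way, a 392 N/m spring gives roughly 3.15 Hz with 1 kg and roughly 2.23 Hz with 2 kg.

```python
import math

G = 9.8  # m/s^2, gravitational acceleration as used in the text


def spring_constant_from_stretch(mass_kg, stretch_m, g=G):
    """Solve k*l = m*g for k, given the static stretch l under load m."""
    return mass_kg * g / stretch_m


def vibrating_frequency(k, mass_kg):
    """Equation (8.4): f = (1 / 2 pi) * sqrt(k / m), in Hz."""
    return math.sqrt(k / mass_kg) / (2.0 * math.pi)


k = spring_constant_from_stretch(1.0, 0.025)   # about 392 N/m
f_1kg = vibrating_frequency(k, 1.0)
f_2kg = vibrating_frequency(k, 2.0)
f_4kg = vibrating_frequency(k, 4.0)            # quadrupled mass: half of f_1kg
f_stiff = vibrating_frequency(4.0 * k, 1.0)    # quadrupled stiffness: twice f_1kg

print(f"k = {k:.1f} N/m")
print(f"f(1 kg) = {f_1kg:.3f} Hz, f(2 kg) = {f_2kg:.3f} Hz")
print(f"f(4 kg) = {f_4kg:.3f} Hz, f(4k, 1 kg) = {f_stiff:.3f} Hz")
```

The last two values illustrate the scaling that (8.4) predicts: quadrupling the mass halves the frequency, and quadrupling the spring constant doubles it.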

The method of tuning stringed instruments consists of changing the tension of the strings by stretching them around tuning pegs (rather than adjusting their mass). An increase of tension on the string lowers its elasticity, thereby increasing its vibrating frequency. Since there are practical limits to the elasticity of all materials, it is necessary to trade off mass against elasticity in order to achieve a desired frequency. This is why instrument makers use smaller-diameter strings for higher pitch, because they carry less mass per unit distance.





8.2.1 Radian Frequency and Angular Velocity 

We can simplify (8.4) by multiplying both sides by 2π:

$$\omega = 2\pi f = \sqrt{\frac{k}{m}}. \qquad \text{Angular Frequency (8.5)}$$

What does ω signify here? Since ω contains f, it still represents frequency, but by letting ω also include the term 2π, we get a frequency parameter that only involves k and m.

Think of ω as frequency expressed in radians per second, where one full period corresponds to 2π radians, whereas integer values of f measure whole periods of a circular or sinusoidal motion. Parameter ω is called angular velocity or radian frequency, depending on the circumstances, and f is just frequency. For example, if a wheel rotates once per second, it passes through 2π radians each second; therefore its frequency f = 1 Hz and its angular velocity ω = 2πf = 2π rad/s. If a spring/mass system vibrates in harmonic motion once per second, f = 1 and radian frequency ω = 2πf = 2π. The term angular velocity is usually used for circular systems, and the term radian frequency is usually used for vibrating systems, but they amount to the same thing.
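The relation between f and ω in (8.5) is just a change of units, which the following small Python sketch spells out. The helper names are hypothetical, and the spring values are illustrative.

```python
import math


def radian_frequency(f_hz):
    """Convert frequency in Hz (cycles/s) to radian frequency (rad/s)."""
    return 2.0 * math.pi * f_hz


def frequency_hz(omega):
    """Convert radian frequency (rad/s) back to frequency in Hz."""
    return omega / (2.0 * math.pi)


def spring_mass_omega(k, m):
    """Equation (8.5): omega = sqrt(k / m), involving only k and m."""
    return math.sqrt(k / m)


wheel_omega = radian_frequency(1.0)       # one rotation per second: 2 pi rad/s
omega = spring_mass_omega(392.0, 1.0)     # illustrative spring/mass values
print(f"wheel: {wheel_omega:.4f} rad/s")
print(f"spring/mass: omega = {omega:.3f} rad/s, f = {frequency_hz(omega):.3f} Hz")
```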


8.3 Some Simple Vibrating Systems 


A simple spring/mass system vibrates in one dimension with one degree of freedom. Below are some other examples of simple vibrating systems. For simplicity, none of these examples takes friction into account.

8.3.1 Pendulum 


A simple pendulum (figure 8.2), consisting of a mass m attached to a string of length l, vibrates with circular harmonic motion so long as the displacement x ≪ l.² If the mass of the string is much less than m, the frequency of vibration will be

$$f = \frac{1}{2\pi}\sqrt{\frac{g}{l}} \qquad \text{Pendulum Frequency (8.6)}$$

or, expressed in radian frequency, $\omega = \sqrt{g/l}$. Notice that mass does not appear in this equation. The frequency of a pendulum is strictly a function of length l and gravitational acceleration g.

Figure 8.2 Pendulum.

Figure 8.3 Piston.
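As a quick check of the pendulum formula (8.6), the sketch below evaluates it for two lengths. The function name and the chosen lengths are illustrative assumptions, and the small-displacement condition is taken for granted.

```python
import math


def pendulum_frequency(length_m, g=9.8):
    """Equation (8.6): f = (1 / 2 pi) * sqrt(g / l), valid for small swings.
    Note that the mass of the bob is not a parameter at all."""
    return math.sqrt(g / length_m) / (2.0 * math.pi)


f_long = pendulum_frequency(1.0)    # a 1 m pendulum: roughly 0.5 Hz
f_short = pendulum_frequency(0.25)  # a quarter of the length: twice the frequency

print(f"l = 1.00 m: f = {f_long:.3f} Hz")
print(f"l = 0.25 m: f = {f_short:.3f} Hz")
```

Because f depends on the square root of 1/l, the length must shrink by a factor of four to raise the frequency by a factor of two.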

8.3.2 Piston

Air captured inside a cylindrical tube by a piston of mass m will tend to vibrate at a frequency determined by the mass and the elasticity (otherwise known as compressibility or compliance) of the air (figure 8.3). The compressibility of the air depends upon a number of factors, including the cross-sectional area of the cylinder A, the length of the air column l, the pressure of the gas P, and the heat capacity ratio γ of the gas, which has a value of about 1.4 for air (see section 7.4.2). The spring constant of air is k = γPA/l, and the frequency is

$$f = \frac{1}{2\pi}\sqrt{\frac{\gamma P A}{ml}}. \qquad \text{Piston Frequency (8.7)}$$

Perhaps it is not surprising that frequency should be proportional to inherent molecular elasticity and gas pressure, but the A and l terms may seem a little counterintuitive at first glance. Why does frequency go up as the area increases?

To see this, imagine that we replace the air with many very slender springs going from the piston to the bottom of the cylinder. If we increase the length l of the air column, it's as though we add more springs end to end in series (figure 8.4b). Many springs in series are more elastic than one spring by itself. Thus, increasing the length is like adding more springs in series: the compliance goes up, so the frequency goes down.

If we increase the area A of the piston, it's as though we add more springs side by side in parallel (figure 8.4c). Many springs in parallel are stiffer than one spring by itself. Thus, increasing the area is like adding more springs in parallel: the compliance goes down, so the frequency goes up.
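The series/parallel argument can be checked numerically against k = γPA/l. The sketch below uses the standard combination rules for ideal springs (stiffnesses add in parallel, compliances add in series). The pressure, area, length, and mass are arbitrary illustrative values, and all function names are hypothetical.

```python
import math

GAMMA_AIR = 1.4  # heat capacity ratio for air, as given in the text


def air_spring_constant(pressure_pa, area_m2, length_m, gamma=GAMMA_AIR):
    """Spring constant of the trapped air column: k = gamma * P * A / l."""
    return gamma * pressure_pa * area_m2 / length_m


def piston_frequency(mass_kg, pressure_pa, area_m2, length_m, gamma=GAMMA_AIR):
    """Equation (8.7): f = (1 / 2 pi) * sqrt(gamma * P * A / (m * l))."""
    k = air_spring_constant(pressure_pa, area_m2, length_m, gamma)
    return math.sqrt(k / mass_kg) / (2.0 * math.pi)


def springs_in_series(constants):
    """Ideal springs end to end: compliances (1/k) add."""
    return 1.0 / sum(1.0 / k for k in constants)


def springs_in_parallel(constants):
    """Ideal springs side by side: stiffnesses add."""
    return sum(constants)


P, A, L, M = 101325.0, 0.001, 0.1, 0.1  # illustrative values (Pa, m^2, m, kg)
k_air = air_spring_constant(P, A, L)

# Doubling the column length behaves like two identical air springs in series.
print(air_spring_constant(P, A, 2 * L), springs_in_series([k_air, k_air]))
# Doubling the piston area behaves like two identical air springs in parallel.
print(air_spring_constant(P, 2 * A, L), springs_in_parallel([k_air, k_air]))
print(f"f = {piston_frequency(M, P, A, L):.2f} Hz")
```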

8.3.3 Helmholtz Resonator 

If air is blown across the mouth of a bottle, the air stream contains many frequencies, but the bottle steals energy (mainly) from just one frequency supplied in the air stream and converts it into simple harmonic motion, which is heard as a breathy tone. Resonance is the tendency of a system to steal energy from, and vibrate sympathetically at, a particular frequency in response to energy supplied at that frequency.

Figure 8.4 Springs in series and parallel. (a) Compliant. (b) Springs in series: more compliant. (c) Springs in parallel: less compliant.

Figure 8.5 Helmholtz resonator.

The bottle acts as a Helmholtz resonator,³ which is a variation on the piston. The air captured in the neck of the bottle constitutes the mass, and the air in the chamber of the bottle constitutes the spring. The frequency of vibration depends upon the compliance of the air in the chamber and the mass of the air in the neck (figure 8.5).

The resonant frequency is approximately

$$f = \frac{c}{2\pi}\sqrt{\frac{A}{Vl}}, \qquad \text{Helmholtz Resonator (8.8)}$$

where c is the speed of sound, A is the cross-sectional area of the neck, V is bottle volume, and l is the length of the neck. I say "approximately" because the effective length of the neck must be increased a little (an end correction) to account for how the air in the tube "recruits" nearby molecules outside the bottle to increase the mass of the air plug. Some end correction must be applied to wind instruments to properly calculate their resonant frequency. Unfortunately, the choice of the end correction scaling term can be rather complicated because the amount of correction required varies depending upon the geometry and proportions of the flange (Benade and Murday 1967; Dalmont, Nederveen, and Joly 2001).

For example, I took a standard Cabernet 750 ml wine bottle with average neck diameter of 19 mm and neck length of 8 cm, drank its contents,⁴ then calculated (with somewhat greater difficulty than usual) as follows:

$$c = 331.6 \text{ m/s}, \qquad A = \pi r^2 = \pi\left(\frac{0.019}{2}\right)^2 \text{ m}^2, \qquad V = 750 \text{ ml} = 7.5 \times 10^{-4} \text{ m}^3.$$

Using an end correction of 1.5 times the radius of the neck's opening yielded

$$l = 0.08 + 1.5 \cdot \frac{0.019}{2} \text{ m},$$

for which (8.8) gives a resonant frequency of 105.7 Hz. Experimentally, the resonant frequency was closer to 110 Hz, two octaves below A440, indicating that the end correction of 1.5 was slightly off.
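The wine bottle calculation can be reproduced in a few lines of Python. This is only a sketch of equation (8.8) under the stated measurements, taking the chamber volume to be the nominal 750 ml (7.5 × 10⁻⁴ m³). The function names are hypothetical, and the second helper simply inverts (8.8) to ask what end correction this idealized formula alone would need in order to land on 110 Hz.

```python
import math


def helmholtz_frequency(c, neck_area, volume, neck_length):
    """Equation (8.8): f = (c / 2 pi) * sqrt(A / (V * l))."""
    return (c / (2.0 * math.pi)) * math.sqrt(neck_area / (volume * neck_length))


def effective_length_for(f_target, c, neck_area, volume):
    """Invert (8.8) for the effective neck length that yields f_target."""
    return neck_area * (c / (2.0 * math.pi * f_target)) ** 2 / volume


c = 331.6                      # m/s, speed of sound used in the text
radius = 0.019 / 2.0           # m, half the 19 mm neck diameter
area = math.pi * radius ** 2   # m^2, neck cross-section
volume = 750e-6                # m^3, assumed equal to the nominal 750 ml
neck = 0.08                    # m, measured neck length

f_bottle = helmholtz_frequency(c, area, volume, neck + 1.5 * radius)
print(f"f with 1.5 r end correction: {f_bottle:.1f} Hz")

# End correction (in units of neck radius) that (8.8) would need for 110 Hz.
factor_110 = (effective_length_for(110.0, c, area, volume) - neck) / radius
print(f"end correction matching 110 Hz: {factor_110:.2f} r")
```

Under these assumptions the first value comes out at about 105.7 Hz, in agreement with the text. The second value suggests a noticeably smaller end correction, though real bottles depart from the idealized geometry in other ways too.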

It is perhaps counterintuitive that the frequency of a Helmholtz resonator rises as the area A grows, but the reason is the same as for the piston.

The ducted port loudspeaker enclosure design shown in figure 8.6a is a practical example of a Helmholtz resonator. The port consists of an opening in the side of the loudspeaker enclosure. The duct is a tube inserted into the port, which performs the same function as the tube at the top of a Helmholtz resonator. The loudspeaker enclosure is a cavity that acts like the volume of a Helmholtz resonator.

Figure 8.6 Ducted port loudspeaker enclosure.

Ideally, loudspeakers are supposed to be colorless reproducers of other sounds, but since they are themselves essentially a spring/mass system, they have a natural vibrating frequency of their own. Loudspeakers therefore tend to exaggerate the strength of signals that are near their natural vibrating frequency. For high-fidelity speakers, the natural vibrating frequency is often below 100 Hz, resulting in an objectionable "boomy" coloration to bass notes, shown as the peak in the magnitude spectrum plot (figure 8.6b). The purpose of the ducted port enclosure is to compensate for the natural vibrating frequency of the loudspeaker, to even out its response to low frequencies.

The size of the enclosure and the size of the duct are designed so that the air inside the enclosure vibrates at the same frequency as the loudspeaker. When the loudspeaker is sounding at its natural frequency, it causes the air in the enclosure to resonate (figure 8.6c). But, as mentioned, a resonator steals energy at its resonant frequency, thereby bleeding away the excess and providing the loudspeaker system with relatively colorless reproduction at low frequencies (figure 8.6d). Precisely how this stealing of energy takes place is discussed in volume 2, chapter 6.

8.4 The Harmonic Oscillator

The vibrating systems shown in previous sections all arise from the interaction of an elastic force and an inertial force. The elasticity provides a restoring force, while the inertia carries the mass past its equilibrium point, thereby extending the vibration. Such systems are called harmonic oscillators.

To understand mathematically how vibration arises, let's return to the simplest harmonic oscillator, consisting of a mass attached to the end of a lightweight spring, suspended from a crossbar (figure 8.7). We can characterize the vibration by analyzing the forces at work on the harmonic oscillator through time. We combine Hooke's law, which characterizes the spring's restoring force, with Newton's second law of motion, which characterizes the mass's inertial force, and observe how these forces interact to cause a sinusoidal displacement of the mass through time.



Figure 8.7 Simple spring/mass system.





For a continued discussion of resonance, skip to section 8.9. However, that treatment depends upon the intervening material.

8.4.1 Hooke vs. Newton

When discussing equation (8.1), Hooke's law of linear elasticity, I talked about the static force F required to achieve a spring displacement x if the spring has stiffness k. But now we want to examine how the spring force would change if we varied the spring displacement through time, so let's consider x as a function of time, x(t). Therefore, we want to study the force

$$F_k = -kx(t), \qquad (8.9)$$

where $F_k$ is the force required to overcome the spring's stiffness to achieve a displacement x at time t.

By Newton's second law of motion we know that the force required to set a mass in motion is proportional to the mass m times its acceleration a. But here again we want to examine how such a force would change if we varied the mass's acceleration through time. So if we consider a as a function of time, a(t), then we want to study the force

$$F_m = m \cdot a(t), \qquad (8.10)$$

where $F_m$ is the force required to overcome the mass's inertia m to achieve an acceleration a at time t.

If we apply no external force to a dangling spring/mass system, it will eventually come to rest with the spring displaced downward slightly by the force of gravity on the mass. Where it comes to rest is its point of static equilibrium. A system is in equilibrium when the sum of the forces operating on it is zero. At the static equilibrium point, the force of gravity is exactly opposed by the spring's restoring force. The mass is at rest relative to the spring.

In what follows it will be convenient to eliminate the effects of gravity and friction, which we can do by imagining the spring/mass system vibrating in outer space.⁵ Because there is no gravity nearby, we must use a bipolar spring, that is, a spring that provides both a pull when stretched and a push when compressed. Imagine one end of this spring attached to the mass and the other end attached to a very massive object, such as a space station. Let's suppose that the mass is at rest relative to the spring and exerts no force ($F_m = 0$) and the spring exerts no counterforce ($F_k = 0$). Then $F_m = F_k$ because both are zero. This system is in static equilibrium because the sum of the forces equals zero.

But we can show that $F_m = F_k$ even if the mass is vibrating, that is, if the system exhibits dynamic equilibrium. A dynamical system is one whose state depends upon its previous state (in addition to any other forces acting upon it). For example, suppose I pull down on the weight, stretching the spring an initial displacement x. The restoring force of the spring tugs on the mass with a force proportional to its displacement. In the first infinitesimal moment after I release it, the restoring force attempts to accelerate the mass upward, but the inertia of the mass reacts with a counterforce proportional to its mass. The tendency of a mass to resist change in velocity is its inertial reactance. If there were no inertial reactance, the spring would just snap back to its equilibrium point. But instead, during this first infinitesimal moment after release, the elastic force and the inertial reactance still balance, and $F_m = F_k$.

As the inertial reactance gives way, the mass accelerates toward the static equilibrium point. But as it does so, the force applied by the spring diminishes, since there is now less displacement, and the spring tugs on the mass with proportionately less force.

When the mass reaches the static equilibrium point, the restoring force of the spring vanishes. Since this means the restoring force is no longer accelerating the mass, the inertial reactance of the mass also vanishes at this point. Thus here as well, $F_m = F_k$. However, though the mass stops accelerating, its momentum continues to carry it upward, past the static equilibrium point. Now the restoring force begins to oppose the upward movement, causing the mass to decelerate by a proportional amount, and $F_m = F_k$ here as well.

In summary, the restoring force $F_k$ grows with increasing displacement from the equilibrium point. The farther the spring is from equilibrium, the more strenuous is the force it applies to the mass in order to return to equilibrium; but because the mass's inertial reactance opposes it, the two forces always balance and are in dynamic equilibrium at all times and in all positions.

8.4.2 Equation of Vibratory Motion

Starting with $F_m = F_k$ and substituting appropriate terms from (8.9) and (8.10) produces m · a(t) = -k · x(t). Expressing this as a dynamic equilibrium yields

$$ma(t) + kx(t) = 0, \qquad \text{Equation of Motion (8.11)}$$

where m is mass, a(t) is acceleration at time t, k is the spring constant, and x(t) is displacement of the mass at time t. Recall that equilibrium means that the sum of applied forces equals zero.

Notwithstanding this intuitive presentation, it would be good to understand how (8.11) can cause an oscillatory vibration because it's not immediately obvious just from looking at it. What we are trying to discover is how displacement x changes as t varies, that is, we want to find an algebraic expression for x(t). Intuition suggests that (8.11) should describe a sinusoidal motion. Actually, to be more precise, it should describe every possible sinusoidal motion, because even such a simple spring/weight system is theoretically capable of creating an infinite variety of sinusoidal motions with different initial phases, amplitudes, and frequencies. They should all be embodied in (8.11). That all possible sinusoidal motions are indeed embodied in (8.11) is the subject of volume 2, chapter 6.
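Short of the full derivation, one can at least watch (8.11) produce a sinusoid numerically. The sketch below steps the equation forward in small time increments using the velocity-Verlet scheme (my choice of integrator, not something the text prescribes) and compares the result with x₀ cos(ωt), where ω = √(k/m). The mass, spring constant, initial displacement, and step size are illustrative assumptions.

```python
import math


def simulate(m, k, x0, v0, dt, steps):
    """Integrate m*a(t) + k*x(t) = 0 with the velocity-Verlet method.
    Returns the list of displacements x(0), x(dt), ..., x(steps*dt)."""
    x, v = x0, v0
    a = -k * x / m                  # acceleration demanded by (8.11)
    xs = [x]
    for _ in range(steps):
        x += v * dt + 0.5 * a * dt * dt
        a_new = -k * x / m          # spring force at the new position
        v += 0.5 * (a + a_new) * dt
        a = a_new
        xs.append(x)
    return xs


m, k = 1.0, 392.0                   # kg, N/m (illustrative values)
x0 = 0.01                           # m, released from rest 1 cm from equilibrium
dt, steps = 1e-4, 10000             # one second of simulated time
omega = math.sqrt(k / m)

xs = simulate(m, k, x0, 0.0, dt, steps)
worst = max(abs(x - x0 * math.cos(omega * i * dt)) for i, x in enumerate(xs))
print(f"largest deviation from x0*cos(omega*t): {worst:.2e} m")
```

The simulated displacement tracks the cosine closely over several periods, which is consistent with (though of course not a proof of) the claim that (8.11) describes sinusoidal motion.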

8.5 Modes of Vibration

The degrees of freedom of a vibrating system are determined by how many independent motions the system can make. There are two kinds of motions: translational, which is backward/forward, left/right, and up/down, and rotational, which involves pitch, yaw, and roll.⁶





A subway has one translational degree of freedom (backward/forward), a car has two (adding left/right), an airplane, three (adding up/down). Adding the rotational motions, an airplane has six degrees of freedom (all three translational and all three rotational motions).

All the vibrating systems described in previous sections have only one degree of freedom. The system shown in figure 8.8, having two weights coupled with springs, has two degrees of freedom. The system in figure 8.8a is just a variation on the simple system in figure 8.7 and exhibits similar harmonic motion. If the total of the mass and spring stiffness of the 8.8a system is the same as that of the 8.7 system, both will vibrate at the same frequency. But even if the mass and spring stiffness of the systems in figures 8.8b and 8.8a are the same, the vibrating frequency of the 8.8b system will be higher because the restoring force from the springs is three times greater. Thus, if the radian frequency for the 8.8a system is $\omega_1 = \sqrt{k/m}$, the radian frequency for the 8.8b system is $\omega_2 = \sqrt{3k/m}$. This method of analysis of vibration was introduced about 1727 by Johann Bernoulli (1667-1748).
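The two frequencies √(k/m) and √(3k/m) can be recovered from the equations of motion of a concrete arrangement. The sketch below assumes (since the figure is not reproduced here) two equal masses m joined to each other and to two fixed supports by three identical springs k. When the masses move together the middle spring is never stretched, and when they move in opposition it is stretched twice as much, which is one way to see the factor of three. The function name is hypothetical.

```python
import math


def normal_mode_frequencies(m, k):
    """Radian frequencies of two equal masses m connected wall-k-m-k-m-k-wall.

    Equations of motion (ASSUMED configuration):
        m * x1'' = -2k * x1 +  k * x2
        m * x2'' =   k * x1 - 2k * x2
    Trying x_i = A_i cos(w t) gives a 2x2 eigenvalue problem D A = w^2 A,
    solved here with the quadratic formula for a symmetric 2x2 matrix.
    """
    d11 = d22 = 2.0 * k / m
    d12 = -k / m
    trace = d11 + d22
    det = d11 * d22 - d12 * d12
    disc = math.sqrt(trace * trace - 4.0 * det)
    w2_low = (trace - disc) / 2.0    # masses moving in the same direction
    w2_high = (trace + disc) / 2.0   # masses moving in opposite directions
    return math.sqrt(w2_low), math.sqrt(w2_high)


w1, w2 = normal_mode_frequencies(1.0, 1.0)
print(f"omega_1 = {w1:.4f}, omega_2 = {w2:.4f}, ratio = {w2 / w1:.4f}")
```

With m = k = 1 the two values are 1 and √3, matching the ratio of the radian frequencies quoted in the text.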

These independent vibrational modes are sometimes called normal modes or natural modes. For each mode, each element of the vibrating system reaches its position of maximum displacement from equilibrium at the same moment. Though the vibrating modes of the 8.8a and 8.8b systems are virtually independent, it is difficult to get a system to vibrate in just one or the other mode without very carefully positioning the balls before releasing them. Ordinarily, the vibration will be a combination of the two modes.

The system shown in figure 8.8 has only two normal modes because it has only two degrees of freedom. A system with N degrees of freedom will have N normal modes. We could add more weights and springs in order to study systems with N degrees of freedom and therefore N modes. Or, we could just increase the number of dimensions of the system from one to two by allowing transverse vibration. The vibrating system in figure 8.9 has four degrees of freedom (the two for the figure 8.8 system plus two more) because now each ball can move in two directions. In general, if the number of masses in a vibrating system is a, and they can each move in b directions, then the number of degrees of freedom N = ab.

Each normal mode has its own characteristic frequency made up of some combination of the average contributions of all the masses in the system and the average contributions of the stiffness in the springs. The resulting motion of the entire system can be characterized as a superposition of all the separate vibrational modes. Figure 8.10 shows the superposition of the normal modes of the systems in figures 8.8a and 8.9b. Since the modes are virtually independent, and each has its own vibrational frequency, the spectrum of frequencies of the entire system is the linear combination (the sum, or mixture) of each mode. If sound radiates from such a vibrating system, we hear the sum total of all frequencies of each of the vibrating modes. The strength of each of these frequencies is proportional to the amount of energy in its mode, separately.

Figure 8.9 Transverse and longitudinal vibration.

Figure 8.10 Superposition of normal modes.

8.6 A Taxonomy of Vibrating Systems 

There are many classification systems of musical instruments, such as the traditional categories of brass, strings, woodwinds, and percussion. Another classification system organizes them as idiophones (chimes, cymbals, xylophone, vibraphone, marimba, gongs), membranophones (drums), aerophones (flutes, oboes, clarinets, trumpets, tubas, whistles, sirens), and chordophones (violin, piano, guitar, harpsichord). If we group instruments by the similarity of the fundamental equations governing their vibration, we obtain the simple taxonomy shown in table 8.1.

Tension is the primary restoring force for strings and membranes, and frequency increases with tension. Stiffness is the restoring force for bars, air columns, and plates, and frequency increases with stiffness.

There are many subgenres for these examples. Bars can be free at both ends or free at only one end. Plates can be clamped at the edge, supported at the edge, supported at the center, or totally free.





Table 8.1
Simple Taxonomy of Musical Instruments

Dimension   Restoring Force   Vibrating Element                          Taxonomy
1-D         Tension           Strings                                    Chordophones
1-D         Stiffness         Bars                                       1-D idiophones
1-D         Stiffness         Air columns (brass, woodwinds, flutes)     Aerophones
2-D         Tension           Membranes (drums)                          Membranophones
2-D         Stiffness         Plates (gongs, cymbals)                    2-D idiophones


Bars, plates, and strings can vibrate unhindered or slap against a surface. The saxophone reed is a bar fixed at one end, which slaps against a mouthpiece. Sitar strings slap against a sloping plate attached to the bridge in order to create the characteristic "sizzle" sound. The bottom membrane of a snare drum slaps against an array of coiled wires laid across it to lend it a characteristic "crunch" sound. In all cases, the resulting spectrum contains much more energy in higher partials because of the discontinuity in simple harmonic motion that the slap introduces.

Traditional musical instruments are made from collections of these elements. For example, the essential elements of a saxophone are a bar and an air column; the essential elements of a piano are strings and a plate (the sounding board).

All taxonomies are necessarily reductionist; this one is, too. For example, the strings of a violin actually vibrate in at least four dimensions: up/down, front/back, longitudinal (end to end), and torsional (twisting) vibration. These motions of the strings all affect each other. Also, an important distinction between instruments is whether they are continuously driven (e.g., violins, voice, woodwinds, brass) or impulsively driven (e.g., piano, harpsichord, guitar, percussion). The advantage of this taxonomy is simply that it allows us to group similar instruments together by the basic physical equations that govern their vibration.

8.7 One-Dimensional Vibrating Systems 

According to the taxonomy of instruments in table 8.1, the vibration formulas for stringed instruments and bar percussion instruments are closely related. This seems counterintuitive. If they are related mathematically, why do they sound so different? For example, few would mistake the sound of a xylophone for that of a piano, even though the piano's strings are also struck. The piano and other stringed instruments made from long thin wires have largely harmonic spectra, whereas percussion instruments generally have inharmonic spectra.

If the formulas for their vibration are to mean anything, they must account for the vast difference in timbre. The aim of this section is to demonstrate the underlying symmetry of one-dimensional vibrating systems.





8.7.1 Strings

Stringed instruments can be classified by

■ How they are sounded:
    Bowed: the violin family
    Plucked: guitar, mandolin, harp, harpsichord
    Struck: piano, hammered dulcimer

■ How they select pitch:
    Unstopped strings: harp, piano, harpsichord
    Stopped fretted: guitar, mandolin, lute
    Stopped unfretted: violin family

■ Whether their sound can be continuously produced:
    Continuously driven: all bowed strings
    Impulsively driven: all plucked and struck strings

The piano and harpsichord provide an array of strings tuned to consecutive scale degrees, and music is played by selecting the appropriate string. The guitar, lute, violin, and mandolin have a smaller array of strings tuned to nonconsecutive scale degrees, and they provide a fingerboard underneath the strings so the player can sound the pitches in between adjacent strings by stopping off different lengths.

The fingerboard on the violin family (violin, viola, cello, and bass viol) is a smooth surface so that any pitch in the continuous pitch space covered by the string may be selected. Guitar, lute, banjo, and mandolin have frets (transverse bars across the fingerboard under the strings) so that when stopped by the finger, the length of the stopped string is determined by the fret. Frets provide an improved ability to stop multiple strings with correct intonation.

Continuous pitches may be produced by sliding the finger along the string of a violin, an effect called glissando. Sliding the finger along the string of a fretted instrument produces a series of discrete pitches, an effect called portamento.

Strings are stretched between rigid supports with a means of adjusting their tension. In virtually every stringed instrument, energy is injected into the string transversely, perpendicular to the string, and transverse motion carries the majority of the energy. Because strings are typically of very low mass and do not displace much air, they are almost always coupled to the air through a sounding board, such as the wooden back of a piano or the body of a violin, mandolin, or guitar. The sounding board allows the energy in the string to be transmitted efficiently into the surrounding air by matching the impedance of the string to the air. Without the sounding board, we would hear very little from a stringed instrument. For example, a strummed unamplified electric guitar is virtually inaudible a few feet away, whereas the sound from an acoustic guitar can fill an auditorium.⁷ The difference is that the acoustic guitar matches string impedance to air, and the electric guitar does not (see volume 2, chapter 8).





Bowed instruments produce a continuous tone by replacing energy in the string as it is dissipated. A skilled player can sustain continuity by instantaneously reversing the direction of the bow when the end is reached. Players of impulsively driven instruments, such as the mandolin, can create the illusion of a sustained tone by rapidly replucking the string, an effect called tremolo. Bowed instruments can also be plucked, an effect called pizzicato.

Unless they are being played with tremolo, all impulsively driven stringed instruments decay gradually to silence from note onset. The rate at which they decay to silence varies enormously. The efficiency with which an instrument radiates energy determines its rate of decay (see section 4.19.2). The banjo is perhaps the most efficient stringed instrument, radiating away all its energy in a few seconds. At the other extreme, the bottom notes of a piano can sustain for several minutes (with the damper pedal down).

In the following subsections I present the vibration of ideal strings that are perfectly flexible, have constant mass per unit length, and are connected to massive, nonyielding supports. At first, I will ignore the effects of dissipation on the vibration of strings. However, all stringed instruments depend upon dissipation to carry sound energy into their surroundings where it can be heard. Tension, not stiffness, is the restoring force of the ideal string. Of course, all real strings have some stiffness, and stiffness is the hidden link that relates stringed instruments to bar percussion instruments.

String Modes  Strings can be usefully studied as many tiny spring/mass systems concatenated together, similar to those shown in figure 8.9. Since the number of possible vibrating modes is large, for simplicity we consider just the first five modes available when the number of degrees of freedom N = 5 (figure 8.11a). Figure 8.11b shows the corresponding first five modes of the infinite number of degrees of freedom of an ideal string (N = ∞).

For each mode, the points where the string crosses the equilibrium are called zero-crossings, points of inflection, or nodes of that mode. Since the strings are fixed at the ends, the ends are nodes as well. Nodes are pivot points around which the string vibrates. For each mode, the points where the string is farthest from equilibrium are called maxima, points of maximum excursion, or antinodes of that mode.
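The mode shapes for N = 5 can be tabulated with a short sketch. It relies on the standard result that, for N identical masses evenly spaced on a string with fixed ends, the displacement of mass i in mode n is proportional to sin(nπi/(N + 1)). The text shows these shapes only graphically, so treat the formula and the helper names here as assumptions made for illustration.

```python
import math


def mode_shape(n, num_masses):
    """Relative displacement of each mass (i = 1..N) in mode n, assuming
    identical, evenly spaced masses and fixed ends."""
    return [math.sin(n * math.pi * i / (num_masses + 1))
            for i in range(1, num_masses + 1)]


def interior_nodes(shape, eps=1e-12):
    """Count sign changes along the string. Each one marks an interior node,
    whether it falls between two masses or exactly on a motionless mass."""
    signs = [1 if v > 0 else -1 for v in shape if abs(v) > eps]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)


N = 5
for n in range(1, N + 1):
    shape = mode_shape(n, N)
    row = "  ".join(f"{v:+.2f}" for v in shape)
    print(f"mode {n}: {row}   interior nodes: {interior_nodes(shape)}")
```

Mode n has n - 1 interior nodes in addition to the two fixed ends, which is the pattern visible in the successive shapes of figure 8.11.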



Figure 8.11 Modes of transverse vibration.





Standing Wavesand Traveling Waves In an ordinary médium such as air, sound propagates 
as traveling waves. But the rigid boundaries at the edges of a string cause most energy to be 
reflected back into the string, and prevent itfrom radiating away (see section 7.8.4). 

The shapes of the modes shown in figure 8.11b are called standing waves because the shape of the string remains the same at all moments and only its amplitude changes. (More precisely, the height of the wave is scaled through time in the direction perpendicular to its length.)

The behavior of a standing wave can best be described as the sum of two waves traveling in opposite directions. Imagine two waves, $y_1$ and $y_2$, moving through each other from opposite directions along a string. Their combined displacement, $y = y_1 + y_2$, creates a standing wave.

To demonstrate this requires some trigonometry. Let the traveling wave moving to the left be represented as the sinusoid $y_1(x, t) = A \sin(kx + \omega t)$ and the one traveling to the right as $y_2(x, t) = A \sin(kx - \omega t)$, where t is time, ω is radian frequency, x is displacement of the wave from its origin along the direction of travel, k is the rate at which phase advances with displacement (the wavenumber), and A is amplitude.

To see how this represents a traveling wave, we reason as follows. If we set k equal to zero, then $A \sin(kx + \omega t)$ reduces to $A \sin \omega t$, which plots an ordinary sine wave with a zero-crossing at the origin. But if k is nonzero, the zero-crossing sits wherever $kx + \omega t = 0$, so as time passes it moves away from the origin at a speed of $\omega/k$.

Now let's return to the standing wave on a string. To see how two oppositely moving traveling waves combine into one standing wave, consider the following trigonometric identities (see volume 2, appendix):

$$\sin(a + b) = \sin(a)\cos(b) + \cos(a)\sin(b)$$
$$\sin(a - b) = \sin(a)\cos(b) - \cos(a)\sin(b).$$

Suppose we let a = kx and b = ωt; then we can represent the two sinusoids as follows:

$$y_1 = A[\sin(a)\cos(b) + \cos(a)\sin(b)]$$
$$y_2 = A[\sin(a)\cos(b) - \cos(a)\sin(b)].$$

Summing the two sinusoids, we have

$$y = y_1 + y_2 = 2A\sin(a)\cos(b) = 2A\sin(kx)\cos(\omega t). \qquad (8.12)$$

Equation (8.12) shows the product of two sinusoids. Its plot is a standing wave that is the point-by-point sum of the two signals, $y_1$ and $y_2$, as they pass through each other. Figure 8.12 shows string mode 4 at several phases and the location of the nodes and antinodes of the string.
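The identity behind (8.12) can also be checked numerically. The following Python sketch is only an illustration under assumed values: the function name and the sample choices of A, k, and ω are arbitrary and do not come from the text.

```python
import math

def standing_wave_check(A=1.0, k=2.0, omega=3.0, samples=50):
    """Return the largest gap between y1 + y2 and 2A sin(kx) cos(wt)."""
    worst = 0.0
    for i in range(samples):
        x = 0.1 * i                      # sample positions along the string
        for j in range(samples):
            t = 0.07 * j                 # sample instants in time
            y1 = A * math.sin(k * x + omega * t)
            y2 = A * math.sin(k * x - omega * t)
            standing = 2 * A * math.sin(k * x) * math.cos(omega * t)
            worst = max(worst, abs((y1 + y2) - standing))
    return worst

max_error = standing_wave_check()
```

The largest discrepancy should be no bigger than floating-point rounding error, which is what we expect if the sum of the two traveling waves really is the standing wave.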

Mode Wavelengths Consider the wavelength of the first mode, the fundamental (figure 8.11b). Since this mode outlines half of a sine wave, if the string length is L, then one full period of its





Chapter 8



Figure 8.12 

String mode 4 as a standing wave. 

wavelength $\lambda_1 = 2L$. One full period of mode 2 fits exactly in length L, so we can write $\lambda_2 = L$. And, in general, we can write

$$\lambda_n = \frac{2L}{n}, \qquad \text{Mode Length (8.13)}$$

where n = 1, 2, 3, ....

Mode Frequencies For an ideal string, the velocity of a transverse wave is the same for all modes because the stiffness doesn't increase with the mode number. (This is not true for real strings.) If the velocity of transverse waves on an ideal string is $v_t$, then we can express the relation between frequency f, wavelength λ, and velocity $v_t$ as $\lambda = v_t/f$, or $f = v_t/\lambda$. Using the definition for $\lambda_n$ from (8.13), we can express the frequency of mode n as

$$f_n = \frac{v_t}{\lambda_n} = \frac{n\,v_t}{2L}.$$

Because the ideal string has no stiffness, $v_t$ depends only on the string's mass per unit length m and its tension T, so that

$$v_t = \sqrt{\frac{T}{m}}.$$

In a string, tension T takes the role of elasticity in a harmonic oscillator. Putting it all together, we can express the frequency of string mode n as

$$f_n = \frac{n}{2L}\sqrt{\frac{T}{m}}, \qquad \text{String Mode Frequency (8.15)}$$

where n = 1, 2, 3, ....⁸

8.7.2 Longitudinal Bars

In the preceding section, we considered the case of the ideal string, which contains tension but no stiffness. The bar vibrating longitudinally, by stretching and shrinking its length, is the other limiting case because it is under no tension; its restoring force is entirely due to its stiffness. Its vibrating frequency equation is very similar to that of the string. The frequency f of mode n is

$$f_n = \frac{n}{2L}\sqrt{\frac{Y}{\rho}}, \qquad \text{Longitudinal Bar (8.16)}$$

where Y is Young's modulus of elasticity, ρ is the mass density of the material, L is the length of the bar, and n = 1, 2, 3, ....
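Equations (8.15) and (8.16) have the same form, which a short Python sketch makes easy to see. This is a minimal illustration, not a definitive implementation: the function names are mine, the string parameters are hypothetical, and the bar uses the aluminum values that appear later in this chapter (Y ≈ 74 × 10⁹ Pa, ρ = 2.7 × 10³ kg/m³).

```python
import math

def string_mode_frequency(n, length, tension, mass_per_length):
    """Equation (8.15): f_n = (n / 2L) * sqrt(T / m), in Hz."""
    return (n / (2.0 * length)) * math.sqrt(tension / mass_per_length)

def longitudinal_bar_frequency(n, length, youngs_modulus, density):
    """Equation (8.16): f_n = (n / 2L) * sqrt(Y / rho), in Hz."""
    return (n / (2.0 * length)) * math.sqrt(youngs_modulus / density)

# Hypothetical string: 0.5 m long, 100 N tension, 0.01 kg/m.
string_modes = [string_mode_frequency(n, 0.5, 100.0, 0.01) for n in (1, 2, 3)]

# Aluminum bar 0.5 m long, vibrating longitudinally.
bar_modes = [longitudinal_bar_frequency(n, 0.5, 74e9, 2.7e3) for n in (1, 2, 3)]
```

In both lists each mode is an integer multiple of the first, which is the harmonic series the equations predict. The bar's fundamental also comes out far higher than the string's for the same length.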

According to (8.16) the modes of the longitudinal bar are in a harmonic frequency series, like strings. The longitudinal vibration modes of a bar are usually very much higher in frequency than the corresponding transverse vibration modes of the same bar. Historically, longitudinally vibrating bars have been used as tuning forks for frequencies above 5000 Hz, where the traditional tuning fork design is no longer satisfactory. Some modern instruments use this vibration mode, for instance, by stroking a steel rod with a rosin-coated cloth to excite longitudinal vibration modes.

Figure 8.13 shows the direction and magnitude of movement of the first three modes of a 
longitudinal bar. 

Young's Modulus The forces needed to stretch a solid object depend upon the following factors (figure 8.14):

■ Amount of stretch For two identical rods (figure 8.14a), proportionately more force is required to stretch one rod further than the other.


Figure 8.13

Modes of a longitudinal bar: a) fundamental; b) first harmonic. A = antinode. (Adapted from Olson 1952.)












Figure 8.14

Stretching solid rods: a) amount of stretch; b) cross-sectional area; c) length of the rod.


Table 8.2

Young's Modulus for Selected Materials

Material        Young's modulus
Polyethylene    0.2-0.7 × 10^10
Wood            0.6-1.0 × 10^10
Nylon           2.0-4.0 × 10^10
Aluminum        69-79 × 10^10
Brass           103-124 × 10^10
Titanium        110 × 10^10
Cast iron       83-190 × 10^10
Steel           190-210 × 10^10


■ Cross-sectional area For rods of identical material and length but different cross-section, the amount of force required to stretch the thicker rod will be proportionately greater (figure 8.14b).

■ Length of rod For rods of identical material and cross-section but different length, the amount of force required to stretch the shorter rod will be proportionately greater (figure 8.14c).

These observations can be combined as follows:

$$F = Y A \frac{\Delta L}{L_0}, \qquad (8.17)$$

where $L_0$ is the original length of the object, ΔL is the increase in length, A is the cross-sectional area, and Y is a constant of proportionality called Young's modulus.⁹ Young's modulus is the ratio of stress to strain of a material. Its value depends upon the nature of the material. Solving for Y in terms of the units involved shows it is measured in pascals (force per unit area, N/m²).

Note that equation (8.17) is valid only if the amount of stretching is relatively small compared to the original length of the object because it only applies to linear elasticity (see section 8.1.3).
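To make the stress/strain reading of (8.17) concrete, here is a small Python sketch. It is illustrative only: the function names are mine, and the rod (Y = 200 × 10⁹ Pa, 1 mm² cross-section, 1 m long, stretched 1 mm) is a hypothetical example rather than a value taken from the text.

```python
def stretch_force(youngs_modulus, area, delta_length, original_length):
    """Force in newtons from equation (8.17): F = Y * A * (delta L / L0)."""
    return youngs_modulus * area * delta_length / original_length

def youngs_modulus_from_test(force, area, delta_length, original_length):
    """Solve (8.17) for Y: stress (F / A) divided by strain (delta L / L0)."""
    stress = force / area                      # N/m^2, i.e. pascals
    strain = delta_length / original_length    # dimensionless
    return stress / strain

# Hypothetical rod: assumed Y = 200e9 Pa, area 1e-6 m^2, 1 m long, stretched 1 mm.
force = stretch_force(200e9, 1e-6, 0.001, 1.0)
recovered_y = youngs_modulus_from_test(force, 1e-6, 0.001, 1.0)
```

Running the two functions back to back recovers the modulus we started with, which is just (8.17) solved in both directions. Because the strain here is 0.1 percent, the example stays within the small-stretch range where the equation applies.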

Table 8.2 is a short list of Young's modulus for various materials. Young's modulus varies a great deal from one sample to the next, depending on the purity of the sample and its manufacturing process.

8.7.3 Transverse Bars 

Transverse vibration can occur where a bar is clamped at one end or is free at both ends. Bars free at both ends are used in instruments such as the xylophone, marimba, vibraphone, glockenspiel, and celeste. Bars fixed at one end, also called cantilever beams, are the key vibrating elements in the harmonium, accordion, jaw harp, and some organ reed stops.

Figure 8.15

Modes of a tuning fork (panels a through d).

The vibrating frequency of a longitudinal bar depends upon its length, density, and elasticity, but in transverse vibration frequency also depends on the thickness and cross-sectional shape of the bar because this has a direct effect on the transverse flexibility of the bar. Additionally, transverse bars can twist, creating torsional modes.

Cantilever Beam To study vibration of transverse bars, suppose we take a springy steel wire with relatively little mass, and stick its base into a rigid surface, then attach a mass to the free end (figure 8.15a). When pulled to the side and released, a coherent vibrating movement occurs over the entire length of the spring; hence this is mode 1 vibration, which produces the fundamental frequency. Given a stiffness k and mass m, equation (8.4) determines the vibrating frequency.

Now attach half of the mass at the end and half in the middle of the spring (so that, overall, the mass is the same). Some energy will vibrate mode 1 (figure 8.15b), but some will vibrate mode 2 (figure 8.15c). Because mode 2 flexes the spring much more than mode 1, the spring constant k for mode 2 is higher, making the vibrating frequency of mode 2 a noninteger multiple of the frequency of mode 1, producing a nonharmonic partial. Mode 2 vibration can be about six times higher in frequency than mode 1, corresponding to an increase in the spring constant by a factor of about 18 (since mass is the same overall).

Now we distribute the mass in thirds (figure 8.15d), allowing us to energize mode 3. The amount of flexing that the spring undergoes for mode 3 vibration is even greater, so the spring constant k for mode 3 is even larger. Mode 3's frequency is approximately 18 times higher than mode 1's, corresponding to an increase in k by a factor of about 186.

For every additional mass added to the wire, we more closely approximate an actual bar. Olson (1952) gives the equation for the fundamental frequency of a cantilever beam as

$$f_1 = \frac{0.5596}{L^2}\sqrt{\frac{Y K^2}{\rho}}, \qquad \text{Cantilever Beam (8.18)}$$





Table 8.3

Modes and Frequencies of Fixed/Free Bar

Partial   No. of Nodes   Node Distance from Free End   Partials     Example Frequency
1         0                                            f_1          33.83
2         1              0.2261                        6.267 f_1    212.00
3         2              0.1321, 0.4999                17.55 f_1    593.69
4         3              0.0944, 0.3558, 0.6439        34.39 f_1    1163.36


where L is the length of the bar in meters, ρ is its mass density in kg/m³, Y is Young's modulus, and K is the radius of gyration.

For a bar of rectangular cross-section, Olson gives the radius of gyration as $K = a/\sqrt{12}$, where a is the thickness of the bar in the direction of vibration. (Width doesn't matter because we are not considering vibration across the thick side of the rectangle.) For circular cross-section, Olson gives K = a/2, where a is the radius of the bar. If the cross-section is hollow, Olson gives

$$K = \frac{\sqrt{a_o^2 + a_i^2}}{2},$$

where $a_o$ is the outside radius of the pipe and $a_i$ is the inside radius. He then gives partial frequencies (table 8.3).

For example, suppose we have a bar 0.5 m long, rectangular in cross-section, made from aluminum that is 10 mm thick. Young's modulus Y ≈ 74 × 10⁹ Pa for aluminum,¹⁰ the mass density ρ = 2.7 × 10³ kg/m³, thickness a = 0.01 m, length L = 0.5 m, and because the bar is rectangular, $K = 0.01/\sqrt{12}$. Plugging these values into (8.18) for n = 1, 2, 3, 4, 5 yields a fundamental and partials shown in the last column of table 8.3.
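The aluminum bar example can be reproduced in a few lines of Python. This is a sketch under the values stated above; the function and variable names are my own, and the partial ratios are simply copied from table 8.3 rather than derived.

```python
import math

# Partial ratios relative to the fundamental, copied from table 8.3.
OLSON_PARTIAL_RATIOS = [1.0, 6.267, 17.55, 34.39]

def radius_of_gyration_rect(thickness):
    """Radius of gyration of a rectangular bar: K = a / sqrt(12)."""
    return thickness / math.sqrt(12.0)

def cantilever_fundamental(length, youngs_modulus, density, k_gyration):
    """Equation (8.18): f_1 = (0.5596 / L^2) * sqrt(Y * K^2 / rho), in Hz."""
    return (0.5596 / length ** 2) * math.sqrt(
        youngs_modulus * k_gyration ** 2 / density
    )

# Aluminum bar from the text: L = 0.5 m, a = 0.01 m, Y = 74e9 Pa, rho = 2.7e3 kg/m^3.
K = radius_of_gyration_rect(0.01)
f1 = cantilever_fundamental(0.5, 74e9, 2.7e3, K)
partials = [ratio * f1 for ratio in OLSON_PARTIAL_RATIOS]
```

The computed fundamental and partials agree with the last column of table 8.3 to within rounding.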

Bar with Free Ends Rossing (1983) supplies the following function for a bar with free ends:

$$f_n = m^2\,\frac{\pi K}{8L^2}\sqrt{\frac{Y}{\rho}}, \qquad \text{Bar with Free Ends (8.19)}$$

where K, L, Y, and ρ are the same as defined in (8.18). The parameter m needs a bit of explaining. Rossing writes, "The frequencies of the modes are in proportion to the squares of the odd integers, almost. The number m begins with 3.0112 and then continues with the simple values 5, 7, 9, . . . (2n + 1)."

We can describe the values for m as follows:

$$m = \begin{cases} 3.0112, & n = 1 \\ 2n + 1, & n > 1 \end{cases}$$

for n = {1, 2, 3, ...}.

For example, using the aluminum bar described in the subsection on cantilever beam and plugging those values into (8.19) for n = 1, 2, 3, 4, 5 yields the frequencies shown in table 8.4.





Table 8.4

Bar Free at Both Ends

Partial   Frequency   Ratio
1         215.25      1.00
2         593.48      2.75
3         1163.21     5.40
4         1922.86     8.93
5         2872.42     13.34
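The entries of table 8.4 follow directly from (8.19) and the piecewise rule for m. The Python sketch below shows one way to compute them; the function names are mine, and the bar parameters are the aluminum values used in the cantilever example.

```python
import math

def mode_number_m(n):
    """Rossing's m: 3.0112 for the first mode, then 2n + 1 (5, 7, 9, ...)."""
    return 3.0112 if n == 1 else 2 * n + 1

def free_bar_frequency(n, length, youngs_modulus, density, k_gyration):
    """Equation (8.19): f_n = m^2 * (pi K / 8 L^2) * sqrt(Y / rho), in Hz."""
    m = mode_number_m(n)
    scale = math.pi * k_gyration / (8.0 * length ** 2)
    return m ** 2 * scale * math.sqrt(youngs_modulus / density)

# Aluminum bar: L = 0.5 m, a = 0.01 m thick, Y = 74e9 Pa, rho = 2.7e3 kg/m^3.
K = 0.01 / math.sqrt(12.0)
free_bar_modes = [free_bar_frequency(n, 0.5, 74e9, 2.7e3, K) for n in range(1, 6)]
free_bar_ratios = [f / free_bar_modes[0] for f in free_bar_modes]
```

Because the frequencies scale as m², the ratios depend only on the sequence 3.0112, 5, 7, 9, 11 and not on the material or dimensions of the bar.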


The ratios of the partials in tables 8.3 and 8.4 are strongly inharmonic. Nonetheless, bars such as these are used for pitched instruments like glockenspiel, chimes, and orchestral bells. This is possible because the higher partials die out quickly (during the initial clang tone), leaving the first partial by itself as a relatively pure tone. Also, the inharmonic higher partials of some of these instruments are well beyond the range of human hearing.

Making Transverse Bars Have More Harmonic Spectra The marimba, xylophone, and vibraphone are made from bars free at both ends, suspended over resonating tubes. The bars are thinned in the middle so as to bring the first two partials into a harmonic relation. Here's how it works.

Thinning the middle of a bar has the effect of reducing the stiffness of just its mode 1 vibration. (It also slightly decreases the mass of the bar, which slightly raises its pitch, but the decrease in stiffness is the more important effect.) The result is that the frequency of the first mode is lowered relative to the others, which are largely unchanged. Marimba bars are thinned enough so that the relation $f_2/f_1$ goes from about 2.75/1 to 4/1. Xylophone bars are thinned less, so the ratio $f_2/f_1 = 3/1$. Thus, $f_2$ is an octave and a fifth above $f_1$. The 3/1 ratio accounts for the prominence of the sound of a musical fifth in the xylophone's timbre.

Each bar of the marimba, xylophone, and vibraphone is also equipped with a resonating tube placed below it to amplify and draw out the fundamental pitch (at the expense of shortening the bar's vibration time because resonance steals energy from the bar at this frequency). The vibraphone also has an electric motor that rotates paddles within each tube. They look just like rotating dampers in a stove pipe. The paddles cut off the energy supplying the resonator tubes, giving a tremolo effect (periodic amplitude fluctuation plus a small periodic fluctuation in pitch) as they rotate. The speed of rotation can be varied by a control on the motor, and the motor can also be switched off. An interesting additional consequence of the flue arrangement on the vibraphone resonators is that tones last longer, on average, when the paddles rotate than when they are open: less energy is radiated from the bars when the resonators are blocked. Therefore the energy lingers longer on average in the bars when the paddles rotate.

8.7.4 Stiffness of Strings and Inharmonicity

We saw in the discussion of transverse bars that stiffness increases in higher modes, stretching the upper partials of these instruments. The same is true for strings, especially thick strings that increasingly resemble transverse bars the thicker they become. Let's return to the discussion of strings and consider the effects of stiffness on nonideal strings.

While the ideal string vibrates in a series of modes that are perfectly harmonic, actual strings have some internal stiffness, so they are not perfectly elastic. Thus there are actually two restoring forces in a string: tension and stiffness, and the vibrating frequency of each mode in a string is determined by both. According to equation (8.15), tension affects all modes equally. However, stiffness provides proportionally greater restoring force for the higher modes because the higher the mode, the more the string is bent. Therefore higher-numbered modes undergo progressively greater amounts of stiffness. And since the frequency of a vibrating string increases with stiffness (and tension), an increase in stiffness causes an increase in frequency. Thus the frequencies of the modes of a stiff string spread out in frequency and are no longer exact multiples of the fundamental. The stiffer a string, the less it acts like a string and the more it acts like a bar, according to the taxonomy in table 8.1.

Case Study: The Piano The range of frequencies a piano must reproduce is from about 27 Hz to 4000 Hz, a ratio of more than 1:100. If we used strings of the same tension and mass, and if the highest-pitched string were only 4 in. long, the lowest strings would have to be well over 33 ft long. Clearly, real pianos aren't that enormous. Why? Equation (8.15) suggests that the only parameters affecting the frequency of a vibrating string are length, tension, and mass per unit length. If we want to shorten the bass strings, then we must either decrease their tension or increase their mass per unit length, or some combination of both; or play some other tricks in combination with these.
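The length estimate follows from (8.15): with tension and mass per unit length held fixed, frequency is inversely proportional to length. A minimal Python sketch of that scaling, with a function name of my own choosing and the round figures quoted above:

```python
def scaled_length(reference_length, reference_freq, target_freq):
    """Length needed for target_freq when T and m are fixed, so f is proportional to 1/L."""
    return reference_length * reference_freq / target_freq

# Top string: 4 in. at about 4000 Hz; bottom string: about 27 Hz.
bass_length_inches = scaled_length(4.0, 4000.0, 27.0)
bass_length_feet = bass_length_inches / 12.0
```

With these figures the bottom string comes out at roughly 49 ft, comfortably "well over 33 ft," the length implied by a ratio of exactly 1:100.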

We could shorten the bass strings if we lowered their tension. But piano strings sound best when they are close to their maximum tension so that they produce a bright and long-lasting tone. So we can make only minor adjustments in tension.

We could shorten the bass strings if we made them thicker. But then they would become more like transverse bars: their overtones would become stretched and they no longer would have strictly harmonic spectra.

Actually, the problem is not so much that bass strings would have inharmonic spectra. By itself, a string with mildly stretched overtones sounds pretty good. In fact, studies have shown that musicians and nonmusicians seem to prefer strings with slightly stretched overtones. The real problem is that the stretched overtones of bass strings do not line up with the fundamentals of strings tuned to the higher octaves of these bass strings. Ideally, we'd like the overtones of the bass strings to line up exactly with the fundamentals of the higher-pitched strings, but they don't because of their stiffness.

Piano makers have employed a variety of strategies to work around this problem. For instance, since thinner strings have less stiffness, they use multiple thinner strings struck simultaneously instead of one thick string. They also wrap wire around strings to increase their mass. Since the string inside the wrapping is relatively thin, overtones are not stretched as much in these strings as would be the case with a solid string of the same thickness. But for compact pianos such as spinets, where the bass strings must be very short, overtone stretching is a serious challenge to tuning the instrument. In fact, harmonic stretching is a problem even for grand pianos with the longest, thinnest strings. This is just a fact of life for piano tuners.





The work-around employed by piano tuners is to tune the higher-pitched strings progressively sharper so that the harmonics of the lower strings more or less line up with the fundamentals of the higher strings. Spinets, which by design must have the shortest, thickest bass strings, require the greatest progressive sharpening to blend away the significant overtone stretching of the bass notes, whereas concert grand pianos require the least because they can have longer strings.

8.7.5 Air Columns

An air column by itself can never be anything more than a Helmholtz resonator, vibrating in sympathy to a sound caused by another source, so it must be coupled to a sound-producing source, which can be anything that vibrates (table 8.5).

Modes of Vibration Vibration of an air column occurs because of longitudinal displacement of air particles. There are two forms of air columns: those open at both ends, and those open at one end only. Additionally, the profile of the pipe may be cylindrical or conical.

Recall that a node is a point where displacement due to vibration is zero, and an antinode is a point where displacement due to vibration is greatest. At the open end of a pipe, there is a displacement antinode because the air inside is free to move in and out of the tube. At the closed end of a pipe, there is a displacement node because the air can't move longitudinally (the closed end prevents it). The vibration modes of air columns can be found quickly using the same approach we took for strings.

Pipe Open at Both Ends Clearly, air is free to vibrate in and out of the ends of a pipe open at both ends. That means a pipe open at both ends can only support modes that have displacement antinodes at both ends. The first four displacement modes are shown in figure 8.16. The figure indicates how much particle displacement is possible at each position along the length of the pipe. The actual particle motion in an air column is the same as shown for longitudinal bar vibration in figure 8.13.

Table 8.5

Air Column Instruments

Bar                    Xylophone, marimba, and vibraphone; some pipe organ ranks; many automobile
                       horns; some enclosed-reed mouth-blown instruments such as the crumhorn
Lips and bar           Woodwinds; jaw harp
Loudspeaker            Ducted-port loudspeaker enclosure
Lips and mouthpiece    Brass instruments
Fipple                 Recorder, pennywhistle, most pipe organ ranks
Lips and fipple        Flutes and fifes


Figure 8.16

Displacement modes of open-ended pipe.





Figure 8.17

Displacement modes of closed-ended pipe.

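The wavelength of mode n is $\lambda_n = 2L/n$, where L is the length of the tube. Therefore the frequency of mode n of a pipe open at both ends is

$$f_n = \frac{c}{\lambda_n} = \frac{nc}{2L}, \quad n = 1, 2, 3, \ldots \qquad \text{Frequency Modes of a Pipe Open at Both Ends (8.20)}$$

where c is the speed of sound in air.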

Equation (8.20) is a slight simplification because the effective length of the tube is actually a little longer than its physical length. The air in the column recruits air near the end of the tube into its vibration pattern, and an end correction scaling must be applied to obtain a reasonable estimate of the effective length. The end correction depends upon the geometry of the opening (see section 8.3.3).

Pipe Closed at One End A pipe closed at one end can only support modes that have displacement antinodes at the open end and displacement nodes at the closed end. The first four are shown in figure 8.17.

■ Mode 1 is one quarter of a sine wave, so $\lambda_1 = 4L$.

■ Mode 2 is three quarters of a sine wave, so $\lambda_2 = 4L/3$.

■ Mode 3 is five quarters of a sine wave, so $\lambda_3 = 4L/5$.

■ Mode 4 is seven quarters of a sine wave, so $\lambda_4 = 4L/7$.

Extracting the pattern, we see that the wavelength

$$\lambda_n = \frac{4L}{n}, \quad n \text{ odd}.$$

Thus, the closed-ended pipe only exhibits odd harmonics, and it sounds an octave below an open-ended pipe of the same length. The equation for the mode frequencies of the closed-ended pipe is

$$f_n = \frac{nc}{4L}, \quad n \text{ odd}, \qquad \text{Frequency Modes of a Pipe Closed at One End (8.21)}$$

with c the speed of sound in air, as in (8.20). This is also a slight simplification because of the need for an end correction.
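A short Python sketch puts (8.20) and (8.21) side by side. Treat it as illustrative only: the function names, the 0.6 m pipe length, and the speed of sound of 343 m/s are my assumptions, and the end correction is ignored.

```python
SPEED_OF_SOUND = 343.0  # m/s, an assumed room-temperature value

def open_pipe_modes(length, count, c=SPEED_OF_SOUND):
    """Equation (8.20): f_n = n c / (2L) for n = 1, 2, 3, ..."""
    return [n * c / (2.0 * length) for n in range(1, count + 1)]

def closed_pipe_modes(length, count, c=SPEED_OF_SOUND):
    """Equation (8.21): f_n = n c / (4L) for odd n only (1, 3, 5, ...)."""
    return [n * c / (4.0 * length) for n in range(1, 2 * count, 2)]

# Hypothetical pipe 0.6 m long, first four modes of each kind.
open_modes = open_pipe_modes(0.6, 4)
closed_modes = closed_pipe_modes(0.6, 4)
```

The closed pipe's fundamental is half the open pipe's (an octave lower), and its modes stand in the ratios 1 : 3 : 5 : 7, whereas the open pipe gives 1 : 2 : 3 : 4.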

From this we can explain why a clarinet sounds an octave lower than a flute in spite of being approximately the same length: the flute functions as a pipe that is open at both ends, whereas the clarinet is closed at one end.¹¹ The same fact explains why the spectrum of a flute includes all harmonics, whereas that of the clarinet contains only odd harmonics. Differences in their harmonic spectra also account for what happens when they are overblown. The flute overblows at the octave: it sounds eight diatonic steps above standard fingering. The clarinet overblows at the twelfth: it sounds twelve diatonic steps (an octave plus a fifth) above standard fingering.

Tube with Conical Bore The bores of flute and clarinet are both approximately cylindrical and are approximately the same length. The oboe, bassoon, and saxophone have approximately conical bores and are all closed at one end.

Since the oboe is about as long as a flute but closed at one end, we might naively predict that the oboe should, like the clarinet, be able to play an octave below the flute. But, in fact, the oboe and flute have about the same bottom pitch. Why?

The simple answer has to do with the physics of the conical bore of the oboe. In cylindrical tubes sound propagates as a virtually plane wave. (The smaller the bore, the more it is like a plane wave, but as the diameter gets large in comparison to the length, the waves start to become more spherical.) If we ignore the very small effect of air absorption of the sound along the tube, the amplitude of the signal is relatively constant along its length.

But because sound spreads out as it moves toward the open end of a cone, we must take into account the effects of attenuation of the signal as it travels toward the open end. Recall from equation (4.36) that intensity I falls off with the square of the distance r, so $I \propto 1/r^2$. Recall also that amplitude S is proportional to the square root of intensity, so $S \propto \sqrt{I}$. Therefore amplitude diminishes as 1/r along the inside of a cone.

Conical tubes, like cylindrical ones closed at one end, must have a displacement node at the closed end and a displacement antinode at the open end. But the wavelengths that fit must take into account the 1/r amplitude scaling (figure 8.18). For a tube with conical bore of length L, the wavelengths that fit are the sinusoids


for n = 1, 2, 3, ... because they all have a node at r = 0 and an antinode at r = L. (To understand the node, think carefully about the value of this expression as r goes to zero; to understand the antinode, think about its value as r goes to L.) Because all n sinusoids fit, the spectrum contains all harmonics. Because the wavelength of the conical bore's fundamental is 2L, its fundamental pitch is the same as a cylindrical bore of the same length. The silver flute and oboe are approximately the same length; the bottom note of the silver flute is C4, and the oboe's bottom note is a whole step lower, B♭3.


Figure 8.18

Harmonic pressure waves of a conical bore (modes 1 through 4).





8.8 Two-Dimensional Vibrating Elements 


There are many musically interesting vibrating surfaces, including stretched membranes and plates.

Stretched membranes such as drums are the two-dimensional equivalent of the stretched string, where the restoring force depends upon tension. Like the ideal string, the ideal membrane is infinitely flexible, infinitely thin in cross-section, and uniformly stretched by a force sufficiently massive not to be affected by the motion of the membrane. The overtone series of stretched strings is harmonic, but stretched membranes have inharmonic spectra. Besides many percussion instruments, instruments with stretched membranes include the resonator for banjos and the Hindustani sarod and esraj.

Plates are the two-dimensional equivalent of the transverse bar, where the restoring force depends upon inner stiffness of the plate material. Whereas stretched membranes must always be fastened at the rim, plates can be clamped at the edge, supported at the edge, supported at the center, or completely free to vibrate. A piano sounding board can be thought of as a plate supported at the edge. A cymbal is a plate supported at the center. Although analytic solutions for arbitrary two-dimensional geometries can certainly be derived, this section focuses on circular shapes.

Both concentric and radial vibration modes are possible for circular vibrating elements. Circular modal geometries are traditionally classified by two numbers, the first indicating the number of radial nodes, and the second indicating the number of concentric nodes (always including the node at the rim). General membrane vibration modes are shown in order of increasing modal frequency in figure 8.19. This classification applies to circular stretched membranes and also to circular plates clamped or supported at the edge because these systems always have a node at the rim (they are clamped). However, it does not apply to circular plates supported at the center or free because these systems have an antinode at the rim.


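Figure 8.19

Modes of two-dimensional vibration. Panels include mode 01 ($f_1 = 1.0$), mode 12 ($f_6 = 2.92 f_1$), mode 22 ($3.5 f_1$), mode 03 ($f_9 = 3.60 f_1$), mode 32 ($f_{11} = 4.06 f_1$), and mode 41 ($3.16 f_1$), along with panels at $1.59 f_1$, $2.14 f_1$, and $2.65 f_1$.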






Figure 8.20

Bessel function of the first kind, order 0, $J_0(x)$.

8.8.1 Bessel Functions 

It can be demonstrated that the contour of the surface for each mode shown in figure 8.19 is given by a Bessel function of the first kind. There are families of these Bessel functions ("of the first kind," "of the second kind," etc.). Each family is made up of functions of integer orders. Figure 8.20 shows a Bessel function of the first kind, order 0 in the range −15 < x < 15. Bessel functions of the first kind are traditionally denoted by the letter J, with a subscript indicating the order. Thus, the function shown in figure 8.20 would be written $y = J_0(x)$. Bessel functions of the first kind resemble damped sinusoids because their peak amplitudes gradually diminish as the index x increases. They have the characteristic shape of a cross-section of a vibrating two-dimensional object such as a drum head. We might imagine it could be a stop-action photograph of a water wave emanating from where a drop of water fell into a pond.

8.8.2 Stretched Circular Membranes

The fundamental frequency f of a vibrating membrane is conventionally given as

$$f = \frac{0.765}{d}\sqrt{\frac{T}{\sigma}}, \qquad \text{Stretched Membrane (8.22)}$$

where σ is the area density, d is the diameter, and T is the tension. Hall (1980) specifies area density for a Mylar tympani head (with 2 mm thickness) as σ = 0.26 kg/m². Tympani drums come in many sizes, and their tightness is frequently adjusted during performance. But assuming a tympani drum with d = 0.6 m and T = 2 × 10³ N/m, the fundamental would be f = 112 Hz, around the pitch A2.
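The tympani figure is easy to reproduce. This is a minimal Python sketch of (8.22) using the values quoted above; the function name is mine.

```python
import math

def membrane_fundamental(diameter, tension, area_density):
    """Equation (8.22): f = (0.765 / d) * sqrt(T / sigma), in Hz."""
    return (0.765 / diameter) * math.sqrt(tension / area_density)

# Tympani example from the text: d = 0.6 m, T = 2e3 N/m, sigma = 0.26 kg/m^2.
tympani_f = membrane_fundamental(0.6, 2.0e3, 0.26)
```

The result rounds to 112 Hz. Since frequency goes as the square root of tension, quadrupling the tension would raise the pitch by an octave.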

Bessel functions of the first kind $J_n(x)$ can be used to model the vibrating modes of a circular stretched membrane. It is easiest to start with mode 01, the fundamental mode, shown in plan in figure 8.19 and in elevation in figure 8.20.

The roots of the Bessel function (the places where it crosses zero) indicate the location of the nodes of the concentric modes. Circular stretched membranes must always have a node where they are clamped at the edge. For instance, figure 8.20 shows that the first roots of $J_0(x)$ are x = ±2.4. Thus the shape of mode 01 vibration is defined as $J_0(x)$ over the range of approximately −2.4 < x < 2.4, labeled $z_1$ in figure 8.20. If this section of the Bessel function is spun 360° around its y-axis to create a circular surface, we get the shape for mode 01 shown in figure 8.21.

Figure 8.21

Panels labeled by mode, Bessel function, and root: 03 ($J_0$, $\beta_{03}$); 02 ($J_0$, $\beta_{02}$); 12 ($J_1$, $\beta_{12}$); 22 ($J_2$, $\beta_{22}$); 32 ($J_3$, $\beta_{32}$).

The location of the next roots of $J_0(x)$ are at x = ±5.5, labeled $z_2$ in figure 8.20. This section of $J_0(x)$ corresponds to mode 02 in figure 8.21. This mode has two circular nodes, one at the outer edge and the other about halfway toward the center. If the radius of the outer node is 1.0 m, the radius of the inner node would be about 2.4/5.5 = 0.436 m.

Following this pattern, the shape of mode 03 vibration is defined as $J_0(x)$ over the range of approximately −8.6 < x < 8.6, labeled $z_3$ in figure 8.20.

For circular membranes, the frequencies of the modes are given by

$$f_{mn} = f \cdot \beta_{mn}, \qquad \text{Drum Head Mode Frequencies (8.23)}$$

where f is the fundamental frequency of the membrane, and $\beta_{mn}$ is the nth root of the mth-order Bessel function of the first kind. (For convenience, I count the first root of the Bessel functions as n = 1.) Figure 8.22 shows $J_m(x)$ for m = 0, 1, 2, 3. The function $\beta_{mn}$ is just the list of all the places where the Bessel functions are zero, sorted by Bessel function order.

Unfortunately, the roots of the Bessel functions are not evenly spaced, and no simple equation is known for finding them. However, we can approximate their values. For instance, we've already observed that $\beta_{01} = 2.4$, and $\beta_{02} = 5.5$. Thus, by (8.23), if the frequency of $f_{01}$ = 240 Hz, the frequency of $f_{02}$ would be about 550 Hz, and so on.
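One way to approximate the roots is numerically. The Python sketch below is an illustration under my own choices, not a method given in the text: it evaluates $J_m(x)$ from Bessel's integral, $J_m(x) = \frac{1}{\pi}\int_0^{\pi}\cos(m\tau - x\sin\tau)\,d\tau$, using the trapezoid rule, then scans for sign changes and refines each one by bisection. The function names, step size, and iteration counts are assumptions.

```python
import math

def bessel_j(m, x, steps=400):
    """J_m(x) for integer m, via the trapezoid rule on Bessel's integral."""
    h = math.pi / steps

    def integrand(tau):
        return math.cos(m * tau - x * math.sin(tau))

    total = 0.5 * (integrand(0.0) + integrand(math.pi))
    for i in range(1, steps):
        total += integrand(i * h)
    return total * h / math.pi

def bessel_roots(m, count, step=0.05):
    """First `count` positive roots of J_m, found by scanning and bisection."""
    roots = []
    x = step                      # start just above 0, where J_m(0) = 0 for m > 0
    prev = bessel_j(m, x)
    while len(roots) < count:
        nxt = bessel_j(m, x + step)
        if prev * nxt < 0:        # sign change: a root lies in (x, x + step)
            lo, hi, f_lo = x, x + step, prev
            for _ in range(50):
                mid = 0.5 * (lo + hi)
                f_mid = bessel_j(m, mid)
                if f_lo * f_mid <= 0:
                    hi = mid
                else:
                    lo, f_lo = mid, f_mid
            roots.append(0.5 * (lo + hi))
        x += step
        prev = nxt
    return roots

beta_0 = bessel_roots(0, 3)   # roots of J_0: beta_01, beta_02, beta_03
beta_1 = bessel_roots(1, 2)   # roots of J_1: beta_11, beta_12
```

The first three roots of $J_0$ land near 2.4, 5.5, and 8.6, the values read off figure 8.20, and the ratio $\beta_{11}/\beta_{01}$ comes out near 1.59, one of the frequency ratios shown in figure 8.19.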

The equation used to plot the vibration pattern of the drum modes shown in figure 8.21 is 


) m (rx) cos(m<¡>) cos(tx), 


Drum Vibration (8.24) 





Vibrating Systems 



where m is the number of diametric nodes in the mode, r and φ are the polar coordinates of the point on
the surface of the drum being evaluated, and x is the value of the nth root of the mth-order Bessel
function. For example, the surface of mode 01 (figure 8.21) is defined as m = 0, x = 2.4 (which is the first
root of J_0, referred to as β_01). We can evaluate any point on the surface of the drum by specifying its
location in polar coordinates via radius r, which varies over the unit distance 0 to 1, and φ, which varies
from 0 to 2π. The parameter t specifies the phase of the drum mode's vibration. If this function is plotted
such that t goes gradually from 0 to 2π/x, we see one complete vibration of the drum head for that mode.
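
A minimal sketch of how (8.24) might be evaluated at individual points is given below, in Python. The function names and the four-digit root values are assumptions introduced for illustration; bessel_j is a hypothetical helper that approximates J_m with Bessel's integral, and any accurate Bessel routine could be substituted for it.

```python
import math

def bessel_j(m, x, steps=400):
    """Approximate J_m(x) from Bessel's integral with the trapezoid rule."""
    h = math.pi / steps
    total = 0.5 * (1.0 + math.cos(m * math.pi))  # half-weighted endpoints
    for k in range(1, steps):
        tau = k * h
        total += math.cos(m * tau - x * math.sin(tau))
    return total * h / math.pi

def drum_displacement(m, x, r, phi, t):
    """Equation (8.24): J_m(r*x) * cos(m*phi) * cos(t*x)."""
    return bessel_j(m, r * x) * math.cos(m * phi) * math.cos(t * x)

BETA_01 = 2.4048  # first root of J_0 (rounded to 2.4 in the text)
BETA_02 = 5.5201  # second root of J_0 (rounded to 5.5 in the text)

# Mode 01: full displacement at the center, none at the clamped edge (r = 1).
center_01 = drum_displacement(0, BETA_01, 0.0, 0.0, 0.0)
edge_01 = drum_displacement(0, BETA_01, 1.0, 0.0, 0.0)

# Mode 02: the inner circular node sits where r * BETA_02 equals BETA_01.
inner_node_radius = BETA_01 / BETA_02
inner_node_02 = drum_displacement(0, BETA_02, inner_node_radius, 0.0, 0.0)

# Half a vibration period later (t = pi / x) the center is at the opposite extreme.
center_half_period = drum_displacement(0, BETA_01, 0.0, 0.0, math.pi / BETA_01)
```

Sampling r over 0 to 1 and φ over 0 to 2π on a grid, for a sequence of t values from 0 to 2π/x, would give the frames of one complete vibration of the chosen mode.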

Equation (8.24) was used to plot the concentric and radial modes shown in figure 8.21, which
correspond to the modes shown in plan in figure 8.19. Mode 01 has only one concentric node at
the outer edge where it is clamped. The 0n modes (consisting of the set of modes 01, 02, 03, ...)
are strongly excited when energy is injected into the center of the membrane. Mode 01 makes the
surface move uniformly up and down and radiates energy into the surrounding air very efficiently
because it pushes air directly away from the membrane's entire surface. Consequently, the energy
in this mode radiates into the surrounding air very quickly, and the sound dies away rapidly, so
rapidly that for most drums one simply hears a thump from this mode after the mallet strikes it.
Succeeding 0n modes radiate progressively less efficiently than mode 01, so the energy given them
by the initial mallet strike is conserved through time. Thus they contribute slightly more to the
ringing sound of the drum because they dissipate less quickly.

The 1n modes (modes 11, 12, ...) are strongly excited when the drum is struck between the center
and outer edge. Mode 11 radiates sound less efficiently than mode 01 because it merely sloshes
the surrounding air laterally back and forth from one side of the membrane to the other as the two
halves alternately rise and fall. Because little energy is dissipated, this mode strongly contributes
to the sound of the drum through time. Higher 1n modes contribute progressively less energy to
the overall sound.

Studies show that the modes that most strongly contribute to the ringing tone of a timpani drum
include the 11, 21, 31, 41, and 51 modes.

The frequencies of the stretched membrane partials can also be approximated with the formula
f · 1.2√n, where f is the fundamental frequency given in equation (8.22) and n is the partial number.







Figure 8.23 

Stretched membrane mode frequencies.


Figure 8.23 shows a plot of the frequency coefficients for the first 12 modes in figure 8.19. The
solid line behind the coefficients in figure 8.23 is an approximate fitted curve.

Notice that the higher partials of a stretched membrane increasingly crowd together with
increasing mode number, in contrast to the transverse bar, where higher partials spread apart with
increasing mode number. This accounts for the dense sound of drums in comparison to bar
instruments: drum partials tend to stack up closer and closer together as frequency rises.

8.9 Resonance (Continued)

This section continues the discussion of resonance that began with the Helmholtz resonator (in
section 8.3.3). The aim here is to create a solid framework in preparation for a more detailed
mathematical treatment in volume 2, chapter 6.

Resonance lies at the heart of virtually every kind of musical instrument.

Resonance is the tendency of a system to vibrate sympathetically at a particular frequency in
response to energy induced at that frequency.

Resonance requires two elements: a driving force, represented as a function of time, r(t), and a
driven vibrating system such as a spring/mass combination. A system that contains both these
elements is called a driven harmonic oscillator. The driving force is the input to the vibrating system,
and the forced motion is the output. This is in contrast to the free motion of a vibrating spring (see
figure 8.4), which receives no external force after initial excitation.

While the driving force r(t) can be any function, we get a clearer view of resonance by observing
periodic inputs, and we get the clearest view from studying sinusoids, such as

r(t) = A \cos \omega t,    Driving Force (8.25)

where t is time, A is the driving amplitude, ω = 2πf, and f is the driving frequency. Of course,
more complicated periodic and nonperiodic signals can be used, but here I limit the discussion to
sinusoids.






Figure 8.24 

Driven harmonic oscillator. 

8.9.1 Driven Harmonic Oscillator 

Many musical instruments can be broken down into a part that generates vibrating energy and a
part that modifies vibrating energy to create that instrument's particular sound. For example, the
breath of a flute player is shaped by the resonance of the flute to produce the flute's characteristic
sound. In order to understand the vibrations of such instruments, we can study an equivalent but
simpler system consisting of a harmonic oscillator driven by a variable-speed motor (figure 8.24).

What happens when we vibrate a harmonic oscillator? How can we characterize its motion? We
want to understand how the natural vibrating frequency of the spring/mass system responds to the
frequency of the driving force.

The first thing we need is a driving force that will produce sinusoidal motion, as defined in
equation (8.25). In figure 8.24 the driving force is provided by a motion generator that consists of an
armature attached to a motor shaft. A wheel at the end of the armature is captured in the slot of a
horizontal bar that is attached to a vertical bar. Together, the two bars make a T shape. The
T-shaped bars can only move vertically between four guidewheels. The motor and guidewheels
are mounted on a rigid framework, and the mass is restrained so it can only move up and down.






Figure 8.25 

Phases of the driven harmonic oscillator. 

As the motor turns, and the T bar configuration rises and falls in sinusoidal motion, the generator
raises and lowers the mass via the spring.

Figure 8.25 shows successive snapshots of the system for one rotation of the motor shaft. The
phase angle of the armature is given for each position, together with its corresponding amplitude.
Comparing the displacement of the system to the sinusoidal line below them demonstrates that the
motion of the system is indeed sinusoidal and is related directly to the phase of the motor shaft.

The vertical distance traveled by the motion generator is the driving amplitude A in equation (8.25).
If the length of the armature is s, then the peak-to-peak amplitude of the generator A = 2s. The
revolutions per second of the motor (and hence, complete sinusoidal periods of the motion generator)
correspond to the driving frequency f in equation (8.25).

8.9.2 Response Amplitude

The displacement of a spring and mass from moment to moment depends upon the interplay of
Hooke's law and Newton's first law of motion (see section 8.4). The spring/mass system has a
natural resonant vibrating frequency f_r, but now we must also take into account the fact that it is driven
by r(t), the periodic function of the motion generator. When the driving frequency equals the natural
vibrating frequency, f = f_r, the spring/mass system will respond by vibrating strongly in sympathy
with the driving force. When f ≠ f_r, the response of the spring/mass system will be less strong.






Figure 8.26 

Response amplitude.


The response of the system is the amount of movement made by the mass. If the response of the
system to the driving force is strong, the mass at the end of the spring will move a great distance,
causing the spring to bend. If the response is weak, the spring will bend very little or not at all. So we can
characterize the response amplitude A_r as the difference between the length of the spring/mass system
when it is not being driven (its resting length) and its length while it is being driven.

We measure A_r by observing how much the spring is flexed, either compressed or expanded.
Thus A_r is the change in the length of the spring from its resting length. When A_r is positive, the
spring is stretched; when A_r is negative, the spring is compressed (figure 8.26).

To study resonance, we want to compare the magnitude of the response amplitude A_r to the
magnitude of the driving amplitude A, moment by moment.

8.9.3 Visualizing Driven Oscillation 

Let's set the motion generator to a low frequency. A low frequency is any frequency f that is
substantially lower than the natural vibrating frequency f_0 of the harmonic oscillator. We indicate this
by requiring f ≪ f_0. Next, we position the armature so that it is horizontal and facing to the right (the
position shown in the leftmost drawing in figure 8.25), and switch it on. As it begins turning
counterclockwise, the rising force of the motion generator displaces the spring, which passes the force
along to the mass. By Newton's first law of motion, the inertia of the mass applies a counterforce
to the change in applied force. The spring, being flexible, stretches to make up the difference between
the rising force of the generator and the counterforce of the mass's inertia. As the spring stretches,
by Hooke's law, it applies a greater force to the mass, which consequently accelerates upward.

As the armature rotates toward the vertical position, it no longer lifts the spring/mass system
so quickly, but the mass continues to rise because of Newton's first law of motion. The
spring, squeezed between the generator and the mass, compresses until its counterforce balances
the upward force of the mass. As the armature starts down, the spring is further compressed,
increasing the force on the mass until it accelerates downward as well. The rest of the cycle continues in this
manner.

Figure 8.27

Resonant spectrum.

8.9.4 Varying the Driving Frequency

As we increase the driving frequency f, the force supplied by the generator to the mass increases,
the counterforce of the mass's inertia increases, and the flexion of the spring increases to
compensate. Consequently, the magnitude of the response amplitude A_r grows.

We might expect A_r to continue to grow as we increase the driving frequency, but when f (the
driving frequency) is equal to f_r (the resonant frequency), A_r stops growing and, for higher
frequencies, begins to shrink. As f continues to increase, A_r shrinks even more. Let's define maximum
amplitude A_max as the amplitude at which A_r achieves its greatest value, and the resonant frequency
as the frequency f_r at which A_r = A_max. Then, by definition, the maximum response of the
spring/mass system to the generator occurs when f = f_r (figure 8.27).

Stiffness-Limited Vibration  For very low driving frequencies, where f is near zero, the
acceleration applied to the mass by the generator is small, so the inertial counterforce of the mass is also
small. Since the force of the spring's stiffness is much greater than the counterforce of the mass's
inertia, the displacement of the mass closely tracks the displacement of the spring, which in turn
closely tracks the displacement of the driving force. Since the mass, the spring, and the driving
force are all moving together at the same speed in the same direction at the same time, they are in
phase. For frequencies below resonance, the response amplitude A_r is stiffness-limited because the
spring's stiffness limits the magnitude of A_r.

At low frequencies most of the energy expended by the generator to accelerate the mass is stored
in the mass as kinetic energy (and the rest is stored in the spring as flexion). All energy stored in
the mass (and the small amount stored in the elastic force) is returned to the generator when the
mass is decelerated by the generator. So over time no work is done. (Of course, some energy is
dissipated because of friction, which is ignored here.)

Inertia-Limited Vibration  For high frequencies, where f_r ≪ f, the acceleration applied to the
mass by the driving force is very large, and so the inertial counterforce of the mass is also very large.
The spring is literally caught between these two forces and must flex quite far to span the distance
between the accelerating generator and the lagging mass. The mass will barely have begun to accelerate
in one direction before the spring starts tugging at it from the other direction. As a consequence the mass
moves less and less in either direction as frequency rises above f_r. The response amplitude A_r is
inertia-limited for frequencies above resonance because the mass's inertia limits the magnitude of A_r.

Most of the energy expended by the generator to accelerate the mass is stored in the spring as
flexion (some is stored in the mass). All energy stored in the elastic force (and energy stored in the
mass) is returned to the generator when the spring is unflexed (and the mass is decelerated). So over
time no work is done.

Dissipation-Limited Vibration  Note that the energy stored by the spring and the mass is
conserved (see section 4.17). The energy dissipated by the system consists of such nonconservative
forces as heat and sound radiation. The conservative forces maintain the resonance; the
nonconservative forces dissipate or radiate the system's energy away.

Near the resonant frequency, where f_r = f, the elastic force and inertial force come into balance.
The relative positions of the mass, spring, and armature are such that the generator performs
positive work on the mass throughout its cycle, so energy flows constantly from the generator to the
mass through the spring. The phase of the generator leads the mass by one quarter of a cycle. The
spring and mass trade energy between each other, never returning it to the generator.

If the spring/mass system continuously absorbs energy from the generator without ever
returning any of it, we might expect that A_r would grow without bound. That's true except for one thing.
A_r tends to grow without bound at resonance, and the velocity of the mass also tends to grow
without bound. But the increased velocity causes energy to be dissipated at a faster rate, radiated as
more intense sound and heat. At some value of A_r, the energy being received by the spring/mass
system from the generator balances the energy being dissipated by the spring/mass system, and the
amplitude reaches its maximum, A_max.

The amplitude of the oscillation is dissipation-limited when f = f_r. This suggests an alternative
definition of resonant frequency:

Resonant frequency is the frequency that is most effective at enabling a vibrating system to
return to its original energy level by dissipation.

Given the propensity of systems to seek the most efficient way to return to their original energy
levels, it seems entirely reasonable that the world should be filled with resonant systems.

If the rate of energy dissipation is small, A_max can become large enough to destroy the system
because there is nothing to stop the escalating amplitude of the mass as it continues to receive energy.








Figure 8.28 

Tacoma Narrows Bridge disaster.



[Plot legend: low damping, medium damping, high damping. Horizontal axis: frequency.]


Figure 8.29 

Effect of damping on resonance. 

The well-documented catastrophic failure of the Tacoma Narrows Bridge is the often-cited case in
point for this phenomenon (figure 8.28). (The common explanation that the bridge failed because it
resonated with a frequency component of the howling wind is not necessarily incorrect, but in fact it was
probably not the simple linear resonance being described here that destroyed the bridge. The exciting
force of the wind was itself affected by the vibrational response of the bridge. The result was a recursive
nonlinear dynamic system (Lazer and McKenna 1990; McKenna 1999) (see section 8.10.1).)

8.9.5 Damping 

What is the effect of various rates of dissipation on resonance? Damping refers to how efficiently
energy can be dissipated by a vibrating system.

Suppose we increase the amount of friction the mass undergoes while moving up and down (see
figure 8.25). We could do this, for example, by suspending the mass in a liquid of some kind. The
viscosity of the fluid resists the vertical vibrating motion of the mass in proportion to the rate at
which the mass is drawn through the fluid: the faster its velocity, the greater the drag. The effect
of greater damping on a resonant system is to reduce and broaden the resonant curve (figure 8.29).





Figure 8.30 

Quality factor. 

Note that the peak resonant frequency declines slightly as damping increases, as indicated by the
curved line drawn through the peaks in figure 8.29.

Different degrees of resonance are required for different purposes in musical instruments. The
resonance peak of an organ pipe must be very narrow so that it sounds just one frequency. But the
resonance of a piano sounding board should be as broad and flat as possible so that it will respond
to all frequencies the same. Similarly, it is important for loudspeakers to have as broad a resonance
as possible so as not to overly color the sounds they reproduce. Since damping broadens the
resonant peak, pianos and loudspeaker systems often are designed to be highly damped.

8.9.6 Bandwidth and Quality Factor

As shown in figure 8.30, we can characterize the sharpness of a resonance by comparing its height
to its girth at some particular distance down from the top of the peak. Starting from A_max, the apex
of the curve, we drop down a distance of 3 dB.¹² The frequencies where this line intersects the
skirts of the curve are f_0 and f_1, and the span of frequencies Δf = f_1 − f_0 is the bandwidth of
the resonator. The ratio of the resonant frequency to bandwidth 3 dB down from peak amplitude
is a frequency-independent measure of the steepness of the curve that engineers call quality factor.
It is defined as

Q = \frac{f_r}{\Delta f}.    Quality Factor (8.26)

Q indicates how much more a driven oscillator absorbs power at its resonant frequency than it does
at a standard distance from the resonance frequency. In figure 8.29, the most highly damped
resonance has the lowest Q.
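
The measurement behind (8.26) can be sketched numerically. The Python example below is an illustration under stated assumptions, not a formula from the text: response_amplitude is the standard textbook displacement response of a damped, sinusoidally driven oscillator, and f0 and q0 are that model's own parameters (undamped natural frequency and nominal quality). The function measure_q then does what figure 8.30 describes: it finds the peak, drops 3 dB, locates the two skirt frequencies, and forms the ratio of peak frequency to bandwidth.

```python
import math

def response_amplitude(f, f0, q0):
    """Displacement amplitude (arbitrary units) of an assumed textbook model:
    a damped oscillator with natural frequency f0 driven sinusoidally at f."""
    return 1.0 / math.sqrt((f0**2 - f**2) ** 2 + (f * f0 / q0) ** 2)

def bisect(func, lo, hi, iterations=80):
    """Zero of func between lo and hi, assuming func changes sign on that interval."""
    flo = func(lo)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        fmid = func(mid)
        if flo * fmid <= 0.0:
            hi = mid
        else:
            lo, flo = mid, fmid
    return 0.5 * (lo + hi)

def measure_q(f0, q0):
    """Return (f_r, bandwidth, Q) read off the model curve, with Q = f_r / bandwidth."""
    f_r = f0 * math.sqrt(1.0 - 1.0 / (2.0 * q0**2))  # frequency of the displacement peak
    target = response_amplitude(f_r, f0, q0) / math.sqrt(2.0)  # 3 dB below the peak

    def excess(f):
        return response_amplitude(f, f0, q0) - target

    f_low = bisect(excess, 0.0, f_r)         # lower skirt
    f_high = bisect(excess, f_r, 2.0 * f0)   # upper skirt
    bandwidth = f_high - f_low
    return f_r, bandwidth, f_r / bandwidth

# Three hypothetical resonators tuned to 440 Hz: high, medium, and low damping.
results = {q0: measure_q(440.0, q0) for q0 in (2.0, 5.0, 20.0)}
```

In this model the measured Q approaches the nominal q0 as damping decreases, and the peak frequency f_r moves slightly downward as damping increases, which matches the trend drawn through the peaks in figure 8.29. The sketch assumes q0 is large enough (roughly above 1.3) for a 3 dB point to exist on the low-frequency skirt.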

8.9.7 Phase Delay

For the harmonic oscillator in figure 8.25, if we plot the phase delay between the angular position
of the motor arm and the linear position of the mass for various values of Q (figure 8.31), we see
that the higher the Q, the more abruptly the system transitions from in-phase to out-of-phase
motion. For a high-Q resonator at low f, phase delay remains near zero until f nearly equals f_r, at
which point small additional increments in f result in large increases in phase delay. On the other
hand, highly damped (low-Q) oscillators build up phase delay gradually.

[Plot legend: low Q, medium Q, high Q.]

Figure 8.31

Phase delay for various quality factors.

[Two panels, a) and b). Horizontal axes: frequency.]

Figure 8.32

Combining resonances.

8.9.8 Resonance with Multiple Degrees of Freedom

These ideas about resonance can easily be extended to more complicated vibrating systems with
multiple degrees of freedom. Each vibrating mode simply has its own resonant response, characterized
by a resonant frequency f_r and Q (figure 8.32a). The total response of such a system is the combination
of these resonant curves (figure 8.32b).


8.10 Transiently Driven Vibrating Systems 

When a performer starts to play a note on a sustaining instrument such as a pipe organ, the
vibration of the instrument builds up gradually over time during the onset, or attack, phase of the note
(figure 8.33). When the performer stops playing, it gradually returns to silence during the decay
phase. The attack and decay phases are known collectively as transients.

Attack  Suppose we set the speed of a driven harmonic oscillator's motor to its resonant
frequency f_r, then switch on its power. This would be analogous to blowing into a flute or organ
pipe, bowing a string, starting to sing, and, in general, beginning a sustained tone. Even if the
motor starts turning instantaneously, it still takes time for the response of the system to reach
A_max because

■ The mass/spring system absorbs energy at a constant rate, causing the amplitude of vibration to
grow.

■ As the amplitude of vibration grows, the system dissipates energy at an increasing rate.

Because the rate of dissipation increases with increasing amplitude, growth in amplitude gradually
slows as dissipation approaches equilibrium with the applied force. The higher the Q, the greater
is the system's energy storage capacity at frequency f_r, and the longer it takes to reach an
equilibrium between the applied force and dissipation.

[Envelope segments: Attack, Sustain, Decay.]

Figure 8.33

Amplitude envelope of a harmonic oscillator.

Steady State  When energy is dissipated at the same rate that it is applied, amplitude growth
stops, and a steady state is achieved. We say the resonator is ringing at f_r.

Decay  When the applied energy is withdrawn, we enter the decay phase, and the system behaves
exactly as described in section 8.4. In a highly damped system (low-Q), vibration ceases quickly
because the dissipation rate is high. But a high-Q resonator has little dissipation, so the energy
drains away more slowly.

Release  A fourth state, release, characterizes the final sound some instruments make if they are
stopped from vibrating by dampers. For example, when lifting the key on a harpsichord before the
tone has died away, there's a slight buzz as the damper presses down on the string to stop it from
vibrating.

8.10.1 Resonance, Recursion, and Xeno's Paradox 

Suppose we examined the amplitudes of the decay at regular intervals, and tabulated a list of
the results over N sample times. We'd have a sequence of samples {A_0, A_1, ..., A_{N−1}}. We select
one of those samples, A_n, where 0 ≤ n < N − 1. We can compute the next value in the sequence,
A_{n+1}, by multiplying A_n by some factor 0 < d < 1, corresponding to the rate of dissipation. If the
amplitude of the current cycle is A_n and the dissipation is d, then the amplitude of the next sample
will be

A_{n+1} = A_n \cdot d.    (8.27)

So the energy in the vibrating system at each moment depends upon how much energy there was
in the previous moment, times a constant factor that determines how much will be dissipated
away as sound and heat. Mathematicians call such systems recursive; physicists call them
dynamical.

Figure 8.34

Exponential decay.

We can evaluate the decay curve by letting A_n = A_{n+1} (that is, treating the value just computed
as the current one) and then repeating (8.27) forever to generate the rest of the curve. For example,
if we set A_0 = 1 and d = 0.9, we have the sequence {1.0, 0.9, 0.81, 0.729, ...}. Plotting these
points reveals an exponential decay (figure 8.34). This function never reaches zero.
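
The recursion in (8.27) is short enough to write out directly. The following Python sketch is one way to do it; the function name decay_sequence and its arguments are hypothetical names chosen for this illustration.

```python
def decay_sequence(a0, d, count):
    """Return `count` samples of the recursion A[n+1] = A[n] * d, starting at A[0] = a0."""
    if not 0.0 < d < 1.0:
        raise ValueError("dissipation factor d must satisfy 0 < d < 1")
    samples = []
    a = a0
    for _ in range(count):
        samples.append(a)
        a = a * d  # equation (8.27): the new value becomes the current value
    return samples

samples = decay_sequence(1.0, 0.9, 4)            # close to 1.0, 0.9, 0.81, 0.729
closed_form = [1.0 * 0.9**n for n in range(4)]   # the same curve written as A[0] * d**n
late_sample = decay_sequence(1.0, 0.9, 200)[-1]  # very small, but still greater than zero
```

The closed form A_n = A_0 · d^n shows why the plotted points trace an exponential curve: each step multiplies by the same factor, so the sequence shrinks geometrically without ever reaching zero.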

A curious aspect of resonant systems is that, theoretically, they never stop vibrating. This is an
acoustic incarnation of Xeno's paradox (see appendix A). Suppose we measure the amount of time
it takes for the amplitude of a note to drop to one half of A_max and call it t_1/2, the halving time. If
A(t) is a measure of the amplitude at time t, and A(0) = A_max, then we can express halving time as

A(t_{1/2}) = \frac{A(0)}{2} = \frac{A_{\max}}{2}.    (8.28)

Since the amplitude drops to 1/2, the vibrational energy in the system drops to one fourth of its original
value at t_1/2, and

A^2(t_{1/2}) = \frac{A^2(0)}{2^2} = \frac{A_{\max}^2}{2^2}.

If we wait until an additional time interval t_1/2 has elapsed, the amplitude will be one fourth and energy
one sixteenth of the original. At each subsequent time interval t_1/2, the amplitude will again be halved
and the energy quartered, but there is still energy present proportional to what was there before. Thus,
unless we wait for eternity, the amplitude never reaches zero (unless it was zero to begin with).
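
The bookkeeping of repeated halvings can be tabulated with a few lines of Python. This is only a sketch: the function name after_halvings is hypothetical, and energy is taken to be proportional to the square of the amplitude, with the constant of proportionality set to 1.

```python
def after_halvings(a_max, k):
    """Amplitude and relative energy remaining after k halving times."""
    amplitude = a_max / 2**k   # halved once per halving time
    energy = amplitude**2      # assumed proportional to amplitude squared
    return amplitude, energy

table = [after_halvings(1.0, k) for k in range(4)]
# k = 0: amplitude 1,   energy 1
# k = 1: amplitude 1/2, energy 1/4
# k = 2: amplitude 1/4, energy 1/16
# k = 3: amplitude 1/8, energy 1/64

amplitude_100, energy_100 = after_halvings(1.0, 100)  # minute, yet still not zero
```

However large k becomes, both quantities remain positive in exact arithmetic, which is the numerical face of the paradox.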

The target value that the amplitude is heading toward (but never reaches) is the asymptote.
During the attack phase, the asymptote is A_max; during the decay phase it is zero.


The Exponential Function and the Time Constant  We've seen that the transients of all linear
resonant systems (where the rate of energy loss or gain is proportional to the current energy)
have a characteristic exponential shape. This includes virtually all musical instruments and sound
in reverberant spaces. The exponential function commonly used in musical applications to model
this is

y = E(t) = A_{\max} e^{-t/\tau},    Exponential Decay (8.29)

where A_max is peak amplitude, time t ≥ 0, and the time constant τ is the characteristic rate of
decay. The asymptote of the exponential decay function is zero.

Conventionally, τ is the time it takes for E(t) to decay by 1/e, that is,

\frac{E(\tau)}{E(0)} = \frac{1}{e} \approx \frac{1}{2.718} \approx 0.368,    (8.30)

corresponding to a drop of 10 log(1/e) = −4.34 dB SIL (see equations (5.31) and (5.32)).

The attack envelope is the inverse of (8.29):

y = E(t) = A_{\max} - A_{\max} e^{-t/\tau}.    Exponential Attack (8.31)

The asymptote of the exponential attack is A_max. Figure 8.33 shows an example of exponential
attack and decay envelopes.
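
Equations (8.29) through (8.31) translate directly into code. In the Python sketch below, the function names and the sample time constant of 0.25 seconds are assumptions made for illustration only.

```python
import math

def decay_envelope(t, a_max, tau):
    """Equation (8.29): A_max * exp(-t / tau), for t >= 0."""
    return a_max * math.exp(-t / tau)

def attack_envelope(t, a_max, tau):
    """Equation (8.31): A_max - A_max * exp(-t / tau), for t >= 0."""
    return a_max - a_max * math.exp(-t / tau)

A_MAX = 1.0
TAU = 0.25  # hypothetical time constant, in seconds

# After one time constant the decay has fallen to 1/e of its starting value (8.30).
ratio_at_tau = decay_envelope(TAU, A_MAX, TAU) / decay_envelope(0.0, A_MAX, TAU)
drop_db = 10.0 * math.log10(ratio_at_tau)   # about -4.34

# After one time constant the attack has covered 1 - 1/e of the way to A_max.
attack_at_tau = attack_envelope(TAU, A_MAX, TAU)

# Sampled envelopes, for example for plotting: 2 seconds at 100 samples per second.
times = [n / 100.0 for n in range(201)]
decay_curve = [decay_envelope(t, A_MAX, TAU) for t in times]
attack_curve = [attack_envelope(t, A_MAX, TAU) for t in times]
```

At every instant the two curves sum to A_max, which is one way to read the statement that the attack envelope is the inverse of the decay envelope.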

Solution to the Paradox  What's the solution to the paradox of the never-ending exponential
envelope?

Often, the force of friction in a vibrating system increases at low amplitude, erasing the little
remaining energy in the vibrating system at an accelerated rate and helping to mark the end of its
sound. A nonlinear friction function at low velocity also explains why the brakes in a car sometimes
start to grab just as it approaches a complete stop. But even if energy is depleted at an accelerated
rate, there's theoretically still some there forever.

Perceptually, a decaying sound will become inaudible if it drops below the threshold of hearing
or the ambient noise level, whichever is higher. So the empirical solution to this version of Xeno's
Paradox is to decide on a time after which we consider the amplitude to be insignificant.


t_60, Decay Time, and the Meaning of Silence  How long does it take for sound in a concert
hall to decay into silence? The reverberation time of a hall (see section 7.13.2) is one of the key
determinants of its acoustical quality: if reverberation lasts too long, music and speech tend to
become blurred, reducing intelligibility. A short reverberation time may improve intelligibility but
may make the room sound dead.

A rule of thumb used widely in architectural acoustics is that a sound with initial amplitude A_max
becomes insignificant after it has decayed by 60 dB SPL. This time, called t_60, is a measure of the
reverberation time of a room. Some cathedrals have a t_60 time of about 10 seconds or more; the
t_60 time of concert halls usually lasts a few seconds. The t_60 time of a bedroom may be a few
milliseconds, and the t_60 time of a good anechoic chamber should be vanishingly close to zero.

Since −60 dB = 20 log_10 0.001, t_60 can be thought of as the time it takes A_max to decay by a factor
of 1000. That is, if A(t) is the amplitude of a sound at time t and A(0) = A_max, then

A(t_{60}) = \frac{A(0)}{1000} = \frac{A_{\max}}{1000}.

We can relate t_60 to the halving time t_1/2 by noting that A(t_60) corresponds to about ten halvings of
amplitude (2^10 = 1024), that is,

A(t_{60}) = \frac{A_{\max}}{1000} \approx A(10 \cdot t_{1/2}),

so t_60 ≈ 10 t_1/2. Similarly, we can relate t_60 to the time constant τ by solving (8.29) for the time
at which E(t) = A_max/1000: t_60 = ln(1000)τ = 6.91τ. Thus, t_60 is just under seven time constants
long. The energy being radiated at time t_60 is proportional to the square of the amplitude, or

\left( \frac{A_{\max}}{1000} \right)^2 = \frac{(A_{\max})^2}{1{,}000{,}000},

which corresponds to a drop in sound intensity of 60 dB SIL. If the original intensity is, say,
100 dB, then 100 dB − 60 dB = 40 dB SIL, approximately the same as the threshold of ambient
noise in quiet listening environments, which is a workable definition of silence.
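
The relations among t_60, the halving time, and the time constant can be checked numerically. The Python sketch below assumes the exponential decay of (8.29); the function names and the sample time constant of 0.3 seconds are hypothetical.

```python
import math

def t60_from_tau(tau):
    """Time for A_max * exp(-t / tau) to fall to A_max / 1000."""
    return math.log(1000.0) * tau

def halving_time_from_tau(tau):
    """Time for A_max * exp(-t / tau) to fall to A_max / 2."""
    return math.log(2.0) * tau

TAU = 0.3  # hypothetical time constant, in seconds

t60 = t60_from_tau(TAU)                  # about 6.91 time constants
t_half = halving_time_from_tau(TAU)      # about 0.69 time constants
halvings_in_t60 = t60 / t_half           # log2(1000), just under 10

amplitude_db_at_t60 = 20.0 * math.log10(math.exp(-t60 / TAU))  # about -60 dB
ten_halvings_db = 20.0 * math.log10(1.0 / 2**10)               # about -60.2 dB
```

Ten halvings overshoot 60 dB by about 0.2 dB because 2^10 is 1024 rather than 1000, so treating t_60 as ten halving times is a close approximation rather than an identity.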

8.10.2 Why High-Frequency Components Die Out Faster

Vibrating systems with many degrees of freedom have multiple resonances, and overall the
response of the system is a composite of the individual resonances of the modes (see section 8.9.8).
It follows that each degree of freedom n has its own damping time τ_n. Characteristically,
high-frequency vibration modes have the smallest τ_n.

To see why, consider two identical masses m_1 and m_2 vibrating in simple harmonic motion with
identical amplitudes but with frequencies f_1 < f_2. By equation (5.27), we know that the energies
of the two masses are E_1 = m_1(2πA f_1)^2 and E_2 = m_2(2πA f_2)^2, and so E_1 < E_2, that is, m_1 has less
energy than m_2. Since at any moment of time, the rate at which energy is radiated is proportional
to the total amount of energy, m_2 loses energy faster than m_1, and its vibrations are damped out more
quickly (Ruiz 1969).

Figure 2.24 shows the evolution through time of the harmonics of a musical instrument tone.
Each instrument has a characteristic way in which its partials evolve through time, which can be
understood by examining the interplay of the vibration modes, the forces the player exerts upon
the instrument, and the coupling of the instrument to the air (see volume 2, chapter 6).

8.11 Summary 

We examined the mathematical formulas that determine the sound of conventional musical
instruments. We combined Hooke's law with Newton's second law of motion to observe in detail the
movement of simple harmonic motion.





A vibrating system has degrees of freedom, which can interact additively. Musical instruments
can be organized by the similarity of their mathematical equations. The parameters are dimension,
restoring force, and vibrating element.

We considered the mathematics of strings, string modes, and standing and traveling waves. Bars
and strings are governed by the same equations, but because bars are stiffer, they are nonharmonic.
Young's modulus is used to determine the vibrating properties of strings and bars.

We examined air columns in light of the Helmholtz resonator and developed models for the
vibration of pipes open at one end or both ends, which differ from models of conical pipes. Drums
are two-dimensional vibrating analogues of strings and bars. Their vibrating modes are
characterized by Bessel functions of the first kind.

The discussion of resonance was continued by examining the behavior of a driven harmonic
oscillator. The behavior of a resonator changes as a function of driving frequency, resonant
frequency, and the amount of damping in the system. The result is characterized in terms of quality
factor Q and phase delay. We considered transiently driven vibrating systems, such as when a
musical instrument starts and stops a note, and observed a paradox related to Xeno's.

8.12 Suggested Reading 

Askill, J. 1979. The Physics of Musical Sound. New York: Van Nostrand.

Moravcsik, M. J. 1987. Musical Sound: An Introduction to the Physics of Music. New York: Paragon.

Pierce, John R. 1983. The Science of Musical Sound. San Francisco: Scientific American Books.

Rigden, J. S. 1977. Physics and the Sound of Music. New York: Wiley.

Rossing, Thomas D. 1983. The Science of Sound. Reading, Mass.: Addison-Wesley.

White, H. E., and D. H. White. 1980. Physics and Music. New York: Holt, Rinehart, and Winston.

Wood, A. 1975. The Physics of Music. New York: Wiley.




Composition and Methodology 


[The Analytical Engine's operating mechanism] might act upon other things besides number, were objects
found whose mutual fundamental relations could be expressed by those of the abstract science of operations,
and which should be also susceptible of adaptations to the action of the operating notation and mechanism
of the engine... Supposing, for instance, that the fundamental relations of pitched sounds in the science of
harmony and of musical composition were susceptible of such expression and adaptations, the engine might
compose elaborate and scientific pieces of music of any degree of complexity or extent.

— Ada Lovelace 1 

The best view of musical composition is provided by a study of methodology. So understanding
methodology is the first aim of this chapter. The subject of methodology encompasses most human
activities, including the arts and sciences. Approaching composition this way has the great advantage
of enabling us to relate the arts and sciences, to see their similarities and differences in sharp relief.

Studying the methodology of composition provides a crisp and efficient way to identify and
compare the aesthetic aims of particular composers and schools of composition. This is of great
benefit because we can then accurately compare and contrast the wide panorama of interests and
values that have concerned composers over the ages.

The second aim of this chapter is to study the development of artificial composing systems.
Perhaps surprisingly, the foundations of this field are over a thousand years old. The implications of
these ideas stir far-reaching and provocative questions about the nature of music and composition.

The third aim of this chapter is to show how compositional principles can be expressed in a
computer programming language and to develop a set of tools that can be adapted for readers' own
music research. A simple but powerful music programming language called Musimat is
presented, and many of the methods discussed are shown in Musimat code. The chapter provides
computational strategies for composing music and insights about the nature of composing.

9.1 Guido's Method

Around 1026 the learned Benedictine and famous music theoretician Guido d'Arezzo developed
a way to teach his students composition which, in his religious environment, amounted to
composing plainchant melodies to accompany sacred texts in Latin. This same Guido invented the
medieval music theory of hexachords, a group of six diatonic tones with a semitone interval in the
middle, e.g., C D E F G A, where the semitone lies between E and F. He also assigned letter names
to the diatonic scale and invented the solmization syllables ut (do), re, mi, fa, sol, la that are
familiar to music students.

Figure 9.1

Guidonian hand. (Adapted from Apel 1944.)

Guido developed a method of memorizing the musical scale and its solmization syllables,
historically called the Guidonian hand (figure 9.1). By pointing to parts of the hand a choir master
can indicate the next note of the melody to be sung. Although it is just a simple way of associating
pitches with positions on the hand, it came to epitomize the entire system of the church modes in
medieval Europe. It became such a powerful metaphor that conservative music theorists of the late
Middle Ages used it to resist the introduction of chromaticism by saying that the new scale degrees
were "not in the hand" (Apel 1944).

Guido's combination of theoretical prowess and practical aptitude laid the foundations of
objective composition, which I define as the use of naturalistic (nonsubjective) processes in
composition. Although we associate the development of automated composition with the twentieth century,
Guido's work demonstrates that it is a quite ancient practice.

Guido published his composition method for students as part of a treatise for singers titled
Micrologus. This was an important source for the development of organum, the earliest type of
polyphonic music in Europe. There is debate as to whether Guido was seriously proposing his
method as a means of composing music, or if it was just a didactic aid for teaching composition.
But in any event, he managed for the first time in history to objectify a way of composing music
into a definite set of rules. Although elementary, his method is the prototype for all objective
compositional systems from that day to this. It can be used for thought experiments on the general
nature of composition.

Guido's first step was to construct a table of correspondences between the notes of the scale and
the vowels contained in the Latin text that is to be set to music. First he laid out the pitches of the
double octave, which was the standard compass of vocal music of his time (figure 9.2). Against
this he placed three iterations of the vowel sequence, a e i o u:

Γ  A  B  C  D  E  F  G  a  b  c  d  e  f  g  a'
a  e  i  o  u  a  e  i  o  u  a  e  i  o  u  a

Figure 9.2

Vowel/note correspondence.

Figure 9.3

Block diagram for Guido's method.

Guido then selected a Latin text and, extracting the vowels from each word, set about looking up
corresponding pitch values from his table. Since this method supplies three choices for each vowel
(and four choices for the vowel a), the method has multiple solutions for each text. Following this
procedure, he composed a melody for the entire text that changed pitch on every vowel. Figure 9.3
shows a block diagram outlining Guido's method. There are two inputs, the vowels of the Latin
text and the choices made by the subjective judgment of the composer that determines from which
vowel group to draw the pitch. The output is the resulting plainchant melody.

For a one-vowel text, there are 3 possible one-note "melodies"; for a two-vowel text, $3^2$
melodies of two notes; and for an N-vowel text, $3^N$ melodies. Thus, the number of possible melodies
grows explosively for longer texts. Guido suggested that anyone who felt his system was too
constraining should expand it by adding another line of vowels under the notes with a different
starting point, doubling the number of choices. Even so, the method still constrains the choices of
a composer, who otherwise could choose any pitch at any time. We assume this was Guido's
intention, to ease his students into the deep ocean of unlimited possibilities with small steps by
the shore.
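
The counting argument above can be checked with a few lines of code. The following is only a sketch in Python (not Musimat), and the names PITCHES, choices_for, and count_melodies are illustrative assumptions rather than anything from Guido or from this book. It reads the choices straight off the vowel/note table, so the vowel a contributes four choices and every other vowel three; the count therefore equals $3^N$ only when the text contains no a.

```python
from math import prod

# Guido's table: sixteen pitches, each paired in order with a e i o u, repeated.
PITCHES = ["Γ", "A", "B", "C", "D", "E", "F", "G",
           "a", "b", "c", "d", "e", "f", "g", "a'"]
VOWELS = "aeiou"


def choices_for(vowel):
    """Return every pitch in the table that sits above the given vowel."""
    return [p for i, p in enumerate(PITCHES) if VOWELS[i % 5] == vowel]


def count_melodies(text):
    """Count the melodies the table permits for the vowels found in text."""
    vowels = [c for c in text.lower() if c in VOWELS]
    return prod(len(choices_for(v)) for v in vowels)


print(choices_for("a"))        # the four pitches available for the vowel a
print(count_melodies("e"))     # one vowel: 3 melodies
print(count_melodies("ei"))    # two vowels: 3 * 3 melodies
```

Adding a second line of vowels, as Guido suggested, would amount to extending choices_for so that it consults two tables instead of one.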

But even if choice is constrained by the method, composers still must exercise their subjective
faculties to develop a pleasing and musically interesting line. Guido suggested that by selecting
only the best excerpts from several attempts, composers could obtain a composition perfectly
adapted to the text and meeting the requirements of good compositional practice.









9.2 Methodology and Composition

Methodology is what allows us to construct the wheel, plant crops, solve equations, and write
symphonies. Methodology is the DNA of human culture: it carries information that enables societies to
function and to persist from generation to generation. The proper study of composition is the study
of methodology, and the full appreciation of methodology requires an understanding of algorithm.

9.2.1 Algorithm 

Algorithm is the most highly qualified methodology. The word comes from algorism, which means
to calculate with Arabic numerals. 2 According to Donald Knuth (1973) algorithm is a broader
concept, covering any set of rules or sequence of operations for accomplishing a task or solving a
problem so long as it demonstrates each of the following five characteristics:

■ Finiteness The method must not take forever.

■ Definiteness Each step must have a significance that is commonly understood.

■ Input The method must have valid materials or information upon which to operate.

■ Output The method must produce at least one result, generated by applying the method to the inputs.

■ Effectiveness The method must always produce the same output from the same input; the result
must not depend upon unknowns (e.g., a miracle, a coin toss, or the phase of the moon); and there
can be no ambiguous outcomes (e.g., dividing by zero is not allowed because the result is undefined).

A method that meets all these requirements is called algorithmic.

According to Knuth, methods also display aesthetic traits, or "goodness." These include
efficiency, simplicity, grace, elegance, parsimony (no extraneous steps or rules), and tractability
(easily adapted to a variety of circumstances). A method's goodness is also demonstrated by how well
it reveals our understanding of the problem being solved.

9.2.2 Euclid's Method

By way of example, consider the problem of finding the greatest common divisor (GCD), which
is the greatest number that divides two numbers without remainder. This comes in handy when
reducing a ratio of two numbers to its lowest terms, as when simplifying interval ratios.
For example, the GCD of 9 and 12 is 3. We just "know" that, but how do we know
it, and how can we represent this knowledge to someone else? And how can we find the GCD of
91 and 416, which we almost certainly do not "know"? Euclid developed the following method to
solve this class of problem for positive integers.

Euclid's Method

1. Given two numbers, m and n, both greater than zero, find the remainder r after integer division of m by n.

2. If the remainder r is 0, the answer is n.

3. Otherwise, let m = n, and let n = r, and start over.



The results for 9 and 12 are shown in table 9.1, and the results for 91 and 416 in table 9.2.

Table 9.1

Euclid's Method, 9 and 12

Step   m     n     r
1      9     12    9
2      12    9     3
3      9     3     0

Table 9.2

Euclid's Method, 91 and 416

Step   m     n     r
1      91    416   91
2      416   91    52
3      91    52    39
4      52    39    13
5      39    13    0
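
The step-by-step results for these two pairs of numbers can be reproduced mechanically. The following Python sketch is one possible rendering of the three steps of Euclid's method; the function name euclid_trace and the row layout (step, m, n, r) are assumptions chosen for illustration, not part of the original method.

```python
def euclid_trace(m, n):
    """Apply Euclid's method to m and n.

    Returns (gcd, rows), where each row is a tuple (step, m, n, r)
    recording one pass through steps 1 to 3.
    """
    if m <= 0 or n <= 0:
        # The method is stated only for numbers greater than zero.
        raise ValueError("both numbers must be greater than zero")
    rows = []
    step = 1
    while True:
        r = m % n                      # step 1: remainder of m divided by n
        rows.append((step, m, n, r))
        if r == 0:                     # step 2: the answer is n
            return n, rows
        m, n = n, r                    # step 3: let m = n, n = r, start over
        step += 1


gcd, rows = euclid_trace(91, 416)
for row in rows:
    print(row)
print("GCD:", gcd)
```

Rejecting zero and negative inputs keeps the sketch inside the range for which the method is stated, so the remainder operation is always well defined.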

9.2.3 Is Euclid's Method Algorithmic?

Yes, Euclid's method is algorithmic.

■ It is finite (it will always eventually reach r = 0).

■ It is definite (for positive integers) because the meaning of division and remaindering for positive
integers is unambiguous. (But if we extend the positive integer range to include zero, the method
is no longer definite because division by zero is undefined.)

■ It has inputs, m and n, and an output, the answer.

■ It is effective because there is no miraculous, random, or subjective element in Euclid's method;
it always gives the correct answer (barring mistakes in arithmetic).

It meets Knuth's aesthetic criteria as well: it is simple (requiring only three steps), elegant (lovely
to think about), and parsimonious (it gets straight to the point). It is also tractable because it can
easily be adapted (e.g., to a computer).

9.2.4 Is Guido's Method Algorithmic?

No, Guido's method is not algorithmic. Subjective choice is required for Guido's method, so it fails
the definiteness criterion. But it meets all other criteria, including Knuth's aesthetic criteria. We
can call methods like Guido's nondeterministic methodologies, but I prefer a more concise name:
art. The characteristic feature of art of all kinds is that it combines objective criteria and methods
with choice making. The difference between art and algorithm is that deterministic methodology
(algorithm) always produces the same result from the same inputs, whereas nondeterministic
methodology may produce variable results even with the same inputs.

Choice making, such as thinking of a number between 1 and 6, is always subjective. So-called
objective choice, such as tossing a six-sided die, is actually just the delegation of subjective choice
to an external process. (Have you ever flipped a coin to make a choice and then decided to do the
opposite?) A delegated external choice-making entity is an oracle. If a method requires consulting
an oracle of any kind, it is automatically art, not algorithm. We should also distinguish between
choice making and choice accepting. The latter is always subjective.

9.2.5 Why Study Methodology?

In order to create their methods, both Euclid and Guido had to reach inside their own subjectivity,
to hold their goals in mind while simultaneously observing their own mental processes long enough
to objectify what they discovered into a set of rules. Because this requires considerable mental
discipline, I believe that we only develop methods where we care deeply about the aim of the method.

This suggests that the study of methods can reveal our values and hidden assumptions. For
example, we observe that Guido's method constrains the music to follow the words, thereby
revealing Guido's belief that the purpose of music was to set off the biblical text, much the way
a ring sets off a jewel.

The guiding principle of this chapter is that the analysis of methodology can reveal the aesthetic
agenda of its creator. Thus, by examining the methods of composers, we can understand the inner
significance of their music. After building some tools and skills, I discuss some ground-breaking
compositional methods for the purpose of examining their underlying values, as a way of helping
us to establish our own.

9.3 Musimat: A Simple Programming Language for Music

If we wish to use computers to operate on music, as Ada Lovelace suggested, we must find ways
to represent music that both composer and computer can understand. The representation must be
intuitive, and yet definite enough to be computable. It must provide expressive control over the
musical materials we wish to operate upon.

In order to study methodologies, we must have a completely definite language with which to
express them. A programming language is a specialized means of describing rule systems and
methods. Musimat is a programming language designed specifically for the subjects presented in
this chapter. A tutorial introduction to Musimat is given in appendix B.

It is possible to read this chapter without knowing Musimat. However, I highly recommend
spending the time needed to understand it before proceeding. Many of the examples in this chapter
are expressed in Musimat, and though I summarize everything in nontechnical language as well,
readers won't be able to adapt and use this information without understanding the language in
which it is written.





9.4 Program for Guido's Method

With the Musimat programming language we can program a version of Guido's method.

First, we transform Guido's vowel sequences to pitches:

    PitchList guidoPitches =
        {G3,A3,B3,C4,D4,E4,F4,G4,A4,B4,C5,D5,E5,F5,G5};

See section B.2.1 for a description of PitchList.

Then we need a source of judgment for which of Guido's three vowel sequences should be
chosen. We'll use the integer Random() method to generate random values. Combining these, we
obtain the program for Guido's method:

    PitchList guido(String text) {
        PitchList G;        //place to put the melody
        Integer k = 0;      //indexes G
        Integer offset;     //indexes guidoPitches[ ]

        //evaluate one character of the text at a time
        For (Integer i = 0; i < Length(text); i = i + 1) {
            Character c = text[ i ];    //get a character of the text

            If ( c == 'a' ) { offset = 0; }
            Else If ( c == 'e' ) { offset = 1; }
            Else If ( c == 'i' ) { offset = 2; }
            Else If ( c == 'o' ) { offset = 3; }
            Else If ( c == 'u' ) { offset = 4; }
            Else { offset = -1; }       //not a vowel

            If ( offset != -1 ) {       //if the character is a vowel
                Integer R = Random( 0, 2 );     //returns 0, 1, or 2
                Integer n = ( 5 * R ) + offset;
                G[ k ] = guidoPitches[ n ];
                k = k + 1;              //advance to the next slot in G
            }
        }
        Return ( G );       //return the list of pitches composed
    }

The program steps through text one character at a time. If character c is a vowel, it
calculates offset based on which vowel it is. If it is not a vowel, the program sets offset to -1
so that the final step is skipped. If it is a vowel, the program chooses a random number 0, 1, or 2,
corresponding to the three possible outcomes for each vowel. This is multiplied by 5,
corresponding to the number of vowels, and added to offset to arrive at the index of the selected element
in the list of guidoPitches. The selected pitch from that list is then stored in
PitchList G. The method is repeated until text is exhausted. PitchList G then contains
the list of pitches composed for this text. As its final action, the PitchList G is returned to the
calling program.
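
For readers who want to experiment outside Musimat, the same procedure can be sketched in Python. This is an illustrative translation under stated assumptions, not the book's code: the names GUIDO_PITCHES and rng are mine, pitches are represented as plain strings, and the text is lower-cased first so that a capital vowel such as the U of "Ut" is also set to a pitch.

```python
import random

# The fifteen pitches of the table, three groups of five (one pitch per vowel).
GUIDO_PITCHES = ["G3", "A3", "B3", "C4", "D4",
                 "E4", "F4", "G4", "A4", "B4",
                 "C5", "D5", "E5", "F5", "G5"]
VOWELS = "aeiou"


def guido(text, rng=None):
    """Compose one pitch per vowel of text, choosing a vowel group at random.

    rng is an optional random.Random instance (an assumption of this sketch)
    so that a run can be repeated by supplying the same seed.
    """
    if rng is None:
        rng = random.Random()
    melody = []
    for c in text.lower():
        offset = VOWELS.find(c)        # 0..4 for a vowel, -1 otherwise
        if offset != -1:
            r = rng.randint(0, 2)      # which of the three vowel groups
            melody.append(GUIDO_PITCHES[5 * r + offset])
    return melody


print(guido("Ut queant laxis resonare."))
```

Passing a seeded generator, for example guido(text, random.Random(1)), makes the oracle's choices repeatable, which is convenient when comparing several candidate melodies for the same text.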











To invoke the function guido(), we need a Latin text. I'll use the first phrase of the text Guido
used to name the solfeggio syllables, the medieval hymn Sanctus Johannes (St. John). This
program fragment prints a list of pitches:

Print(guido("Ut queant laxis resonare.")); 

An example result of this method is shown in figure 9.3.

9.5 Other Music Representation Systems

There are a virtually unlimited number of approaches to the representation of music, depending
upon one's aims. The aim of Musimat is compactness and expressivity for composition. A short
list of some important music representation and programming systems, drawn from the extensive
literature on the subject, follows:

■ MIDI Musical Instrument Digital Interface, a still prevalent standard for encoding and
transmitting musical gestures between computers and music synthesizers (Loy 1985), provides a very
simple and concrete mapping from musical keyboards, knobs, and sliders to musical sounds. In
its original form no specific mapping of sounds was stipulated. One MIDI synthesizer might play
a particular note using string tones, whereas another would use bassoons. More recently, General
MIDI, a standard set of timbres, was adopted. This standard stipulates a common mapping
of timbres that every conforming synthesizer must implement. Scores played on any General
MIDI synthesizer realize a similar orchestration (Jungleib 1996). MIDI presents a normative and
limiting conception of music (F. R. Moore 1988), but it is very widespread.

■ CHARM Common Hierarchical Abstract Representation of Music provides a way of looking
at music that is useful for musicological analysis (Wiggins, Harris, and Smaill 1989).

■ SCORE A music printing system developed over the last 30 years by Leland Smith, SCORE
can be used for high-quality printing of common music notation, tablature, and other nonstandard
musical formats.

■ DARMS This is an early, overly ambitious (flawed but interesting) music description language,
developed by Ray Erickson (1975) and Stephen Bauer-Mengelberg.

■ GUIDO An extensible text-based score representation language for notation software,
composition, analysis, and performance developed by the Salieri Project at the Technical University
in Darmstadt (Hoos et al. 1998).

■ DMIX Developed by Daniel Oppenheim (1996), DMIX combines graphical sound editing,
algorithmic composition, computer programming, and real-time interaction and improvisation.

■ Kyma This is a sound specification language developed by Carla Scaletti (1991). It uses a
general specification of sounds as the building blocks of composition. "The structure of a composition
in this language is the set of traces left by the compositional process, that is, each composition
contains within it a record of how it was composed. This record serves as one of the many possible
analyses of the composition" (Scaletti 1989, 43).

■ Max A graphical programming language developed by Miller Puckette, Max enables
composers to create music-processing systems by connecting processing icons on a screen much the way
one would plug Moog or Buchla analog synthesizer modules together.

9.6 Delegating Choice

The agency of compositional choice (see figure 9.3) can be delegated from one person to another
or from a person to an objective process such as rolling dice.

9.6.1 Subjective Choice

A composer may delegate choice of musical elements to an assistant or amanuensis. This is a
common practice, for example, among famous Hollywood movie composers. Although the head
composer may stipulate criteria, the actual composing is done by assistants. By delegating, the head
composer loses some control over the result.

Even if a composer writes every note in the score in minute detail, its realization will
necessarily include many chance elements introduced by the performer. Conventional rules for
classical performance interpretation are just one of the uncertainties affecting the composer's music.
Others include who performs it, the venue, the choice of instrumentation and equipment,
whether it is broadcast or recorded, other compositions on the program, their order in the
program, and so on. The composer's instructions may be ignored altogether. Most of these
uncertainties remain even if the realization consists of playing prerecorded music. Of course, there
are many forms of music, such as American jazz, where uncertainty predominates because the
music is more-or-less improvised.

Even some classically trained twentieth-century composers experimented with delegating
additional elements of the composition process to performers. An early piece of this type was Karlheinz
Stockhausen's Klavierstück XI (1956), where the score for piano solo consists of 19 disjointed
fragments of music notation of varying lengths, placed on a large sheet of cardboard with plenty of
empty space between them. Stockhausen indicated that the performer should play the fragments
in any order "that catches his eye" and should also choose tempo, dynamic level, and type of attack
during the performance.

Figure 9.4

Alexander Calder mobile: Untitled, 1942 (Cat. A15493).

Other composers of that era, including John Cage, Earle Brown, and Pierre Boulez, devised
pieces that invite performers to determine some part of the structure of the music they are playing.
One can make an analogy between such open compositions and the mobile sculptures of Alexander
Calder (figure 9.4). Calder's mobiles are fixed structures, but the parts can move relative to one
another. The possible shapes are determined by the artist, but virtually infinite configurations
are possible. The compositions of Stockhausen, Cage, Feldman, and others raised some serious
questions about the future of Western classical music. Potter (1971) writes,

Many questions arise regarding formal openness: Is a composer abdicating his responsibility when he
relinquishes formal control? Is formal determination an (the) essential task of the composer? Should a group of
unordered musical segments be considered a single piece? What purpose is served by allowing the performer
to order the material he performs? Is a listener aware of the formal openness? Should he be? Can a single
performance (live or recorded) be a self-sufficient artistic statement, or is more than one performance necessary
to expose the formal variation possible? (120)

Though these are excellent questions that deserve answers, I have a slightly different agenda to pursue
here, which will nonetheless lead back to these kinds of questions, but from a broader perspective.

9.6.2 Objective Choice

The time-worn approach to delegating choice to an objective process is to use dice or an urn of
numbered balls: a ball is pulled "at random" from the urn, and the color or number on the ball is
used to determine the choice. But more fanciful ways have been advanced specifically for
composing music by chance. Mauritius Vogt (1719) suggested a method of composing by bending
hobnails into various shapes, then casting them on the ground and interpreting the rise and fall of the
music by the way the hobnails fell. William Hayes (1751) suggested that the composer spray ink
from a brush onto music manuscript paper, then add note stems, staves, barlines, and all the rest
according to signs drawn from a pack of cards.

In fact, any objective process can be used whether it is random or not. Employing chance as an
agent of choice has a very long history in human culture. For example, wind chimes and aeolian
harps harness random natural forces to create pleasing (dare I say musical?) sounds.

Have you ever used a coin toss to decide what to do? This can help get you unstuck if you are
truly undecided or really don't care about an outcome. But after the coin was tossed, did you really
follow the coin's dictate, or did you back out and choose the outcome yourself after all? Perhaps
the coin's choice just didn't feel comfortable? A chance process is unlikely to make the same
quality of choices that a person would. Actually, this could be good or bad.

On the plus side, chance decisions can help prevent a student of composition from being
overwhelmed by the vastness of possible outcomes. Guido may have had this point in mind. Or a
composer might look to an objective process to suggest a novel direction to take to get past unconscious
biases. The composer Herbert Brün (1970) wrote about using computers to provide a random
choice-making element while composing:

Whereas the human mind, conscious of its conceived purpose, approaches even an artificial system with a
selective attitude and so becomes aware of only the preconceived implications of the system, the computers
would show the total of the available content. Revealing far more than only the tendencies of the human mind,
this nonselective picture of the mind-created system should be of significant importance.

The composer David Cope (1996) reported that overcoming composer's block was one reason he
developed his ambitious Experiments in Music Intelligence (EMI) system (see section 9.24). The
results of a chance process can provide a welcome new perspective that gets a composer out of a rut.

On the minus side, pure chance has no regard for what a composer thinks or would prefer.
Constraining pure chance so that it does what composers want (mostly) occupies a great deal of the
effort on automatic composing systems.

9.6.3 The Role of Interest in Music

Music is a delicate balance between what is familiar and what is surprising. And the ultimate source
of surprise is chance. But this approach is not without risk. A truly random process such as flipping
a coin displays neither skill nor taste at composing because it has no awareness of the music it is
being used to create. It does not, in and of itself, learn from its mistakes or make inferences about
its experiences. It does not favor particular outcomes, and as a consequence, its results have an
undesirable "wandering" quality.

If we want to incorporate chance into composition, and if we care about the interest of our
listeners, we must become students of interest and look for ways to increase the likelihood that
the choices made on our behalf are interesting, because without interest there is no music, only
noise. This problem was solved very cleverly in the antique automatic composing system
described in the next section.

9.6.4 Musikalische Würfelspiel

Whereas Guido's composing method was intended to be driven by human choices, a related
technique, Musikalische Würfelspiel, which arose during the European classical era, was intended to be
driven by a throw of the dice. The music engineering problem solved by this system was how to direct
unregarding chance operations to make musically suitable choices and compose interesting music.

In 1757 in Berlin, Johann Philipp Kirnberger published Der allezeit fertige Polonaisen und
Menuetten Komponist, roughly translated as "The Ever-Ready Composer of Polonaises and
Minuets." Like Guido's, Kirnberger's intent was to provide a simplified means of composing
music. In his preface Kirnberger states that the reader "will not have to resort to professional
composition." Although using the technique required no musical training, it required
considerable compositional skill to create it in the first place. Composers of preeminent stature, including
Wolfgang A. Mozart, Joseph Haydn, and C. P. E. Bach, developed Würfelspiel techniques (Potter
1971).

Because the aesthetics of the European classical era were so strict, it was possible to construct
a simple music-making game for composing minuets, trios, and other incidental works. The
method consisted of applying the outcome of throwing dice (or spinning a spinner, or similar
actions) to choosing which of several possible musical motives would be selected from tables of
precomposed musical figures. A well-formed piece of music in the classical style would result. The
reason that chance does not cause the resulting musical composition to wander is that the
compositions are prestructured to be musically interesting by the master composers who set up the
tables.

Figure 9.5 shows a fragment of Würfelspiel minuet trios attributed to Joseph Haydn (O'Beirne
1968). Only the first two phrases of two of the six original minuet variations are shown, enough
to give an idea of how the method works. Variations a and b can be played as perfectly acceptable
minuet trios. But the composer cleverly arranged for all variations to have the same harmonic plan
and close-enough voice leading so that others could create new minuet trios by interleaving
measures from any of the variations so long as they are taken in order across the page. We can create a
derivative variation, for example, by alternating measures of a and b, {a1, b2, a3, b4, a5, b6, a7, b8},
which sounds like a pleasant minuet trio.
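
The interleaving rule can be sketched in a few lines of Python. This is only an illustration: the measure labels "a1" to "b8" are placeholders standing in for the precomposed measures, and the function names wurfelspiel and alternate, as well as the rng parameter, are assumptions of this sketch rather than anything from the historical sources. The one constraint the code enforces is the one described above: the measure played at each position must come from that same position in one of the variations.

```python
import random

# Placeholder labels for the eight measures of variations a and b.
a = [f"a{i}" for i in range(1, 9)]
b = [f"b{i}" for i in range(1, 9)]


def alternate(first, second):
    """Build the derivative variation a1, b2, a3, b4, ... by strict alternation."""
    return [(first if i % 2 == 0 else second)[i] for i in range(len(first))]


def wurfelspiel(variations, rng=None):
    """For each measure position, let chance pick which variation supplies it."""
    if rng is None:
        rng = random.Random()
    length = len(variations[0])
    if any(len(v) != length for v in variations):
        raise ValueError("all variations must have the same number of measures")
    return [rng.choice(variations)[bar] for bar in range(length)]


print(alternate(a, b))
print(wurfelspiel([a, b]))
```

Chance only selects the row; the column order is never disturbed, which is why the harmonic plan built into the table survives every throw.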

Figure 9.5

Musical dice extract, attributed to Joseph Haydn.

Altogether, there are six variations of 16 bars in the full score, which by the rule of enumeration
would mean there are $6^{16}$ variations. However, because some of the measures are identical in some
of the variations, there are actually "only" 940,369,969,152 enumerations. O'Beirne (1968) lays
out the case for Haydn's authorship in an article that publishes all six of these minuet trios in full.
He writes that if attribution to Haydn is correct, "One question remains: have we found one new
Haydn item? or six? or, in view of the intended permutation possibilities, something more like
1,000,000,000,000 new Haydn trios!"
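
The effect of duplicated measures on the count is easy to verify. In the Python sketch below, the number of distinct pieces is the product, bar by bar, of the number of distinct measures available at that bar. The function name count_pieces and both tables are hypothetical: this passage does not say which measures coincide in the full score. It is nevertheless a matter of arithmetic that 940,369,969,152 is exactly one third of $6^{16}$, which is what would result if, for instance, a single bar offered only two distinct measures while the other fifteen bars offered six.

```python
from math import prod

BARS, VARIATIONS = 16, 6


def count_pieces(variations):
    """Product over bars of the number of distinct measures at that bar."""
    return prod(len(set(bar)) for bar in zip(*variations))


# Hypothetical table in which every measure is unique.
full = [[f"v{v}b{b}" for b in range(BARS)] for v in range(VARIATIONS)]

# Hypothetical table in which the first bar has only two distinct measures.
shared = [row[:] for row in full]
for v in range(VARIATIONS):
    shared[v][0] = "x" if v < 3 else "y"

print(count_pieces(full))     # 6 ** 16
print(count_pieces(shared))   # 2 * 6 ** 15
```

Other patterns of duplication give the same total (one bar with four distinct measures and another with three, for example), so the count alone does not reveal where the repeated measures lie.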

There was one other ingredient in the typical Würfelspiel setup: the measures were not laid out
as obviously as in figure 9.5. Instead, the variations were chopped up into one-measure chunks and
entered in an indexed table in random order. This served no purpose other than to obscure the
underlying mechanism, to make the process seem more "magical" to the user.

Componium Diedrich Nikolaus Winkel (1773-1826), rightly the inventor of the metronome
(see section 2.6.2), is credited as the first to construct an automated music composing machine
(Tiggelen 1987; Buchner 1956). In fact, it seems likely that what he did was to adapt elements of
Kirnberger's Würfelspiel idea to mechanical form. Winkel's Componium is basically a barrel
organ, an orchestrion, like the ones used to accompany merry-go-rounds. These instruments
encode music by pins protruding from the surface of the barrel that key organ pipes or other musical
instruments to play as the drum rotates.

Unlike a standard orchestrion, the Componium was equipped with a second barrel (see figure 9.6).
The first barrel encodes several variations of short musical works. A few barrels survive, containing
works by Mozart, Moscheles, and Spohr. The second barrel, in conjunction with a complicated
gearing apparatus, determines which of the variations will be played from measure to measure, providing
a large enumerative set of possible compositions.



Figure 9.6 

Componium of D. N. Winkel. (Buchner 1956.) 






Originsof Würfelspiel Würfelspiel emerged fromthe era i nwhich probability calculus was pio- 
neered by BI ai se Pascal and work on permutations and combi nations was done by J acob B ernoul I i. 
Gerigk (1934) writes, 

This sort of musical game was in the air in the second part of the eighteenth century, though clearly regarded
as entertainment only. This is, e.g., expressed by Kirnberger, whom we should also regard as the father of
musical literature of this sort in the preface to his composition of this kind (1757). Every game is after all a
mirror of the ideas of the times: the rationalistic epoch considers the possibility of mechanical composition.

Many systems of this type were published, including one by Peter Welcker in London in 1775
under the amusing title, A Tabular System Whereby Any Person without the Least Knowledge of
Musick May Compose Ten Thousand Different Minuets in the Most Pleasing and Correct Manner,
which seems to follow Kirnberger's lead (Köchel 1862).

Turning the Tables Würfelspiel uses chance as an alternative to personal choice for decisions
we do not wish to make or cannot make ourselves. But there are many other reasons to use chance
as a source of choice.

The American composer John Cage (1961) was well known for using chance techniques and
purposeful silence in his compositions. The way in which he incorporated chance operations in the
act of composition invited natural forces to speak directly through his music. Of course, the idea
of appreciating the aesthetics of natural forces channeled through the arts did not originate with
Cage. We listen to wind chimes and aeolian harps for much the same reasons. Some forms of
Japanese painting utilize imperfections in the paper to the same end.

For another example, Santillana and von Dechend (1969), in their landmark work Hamlet's Mill,
discuss several games from ancient times in which the choice of piece to be moved in chess, for
instance, was determined by a throw of the dice. Called "The Game of the Gods" or "Celestial
War," these games are documented in texts dating from the fourteenth and fifteenth centuries in
India and China.

Chance as Oracle In life, we are affected by natural forces that are beyond our ability to predict
and that appear to be utterly random. We observe the effects of our own willful actions and presume
by analogy that random natural events can be seen as the "will" of natural forces acting upon us.
We externalize our personal will and project it by analogy onto a "cosmic will." If we ourselves
deliberately generate chance occurrences such as by throwing coins, we can endow the outcome
with prophetic value because the chance occurrences are presumably in alignment with this same
natural "cosmic will." Perhaps this idea explains why chance is the basis of oracular methods such
as Tarot card readings and the Chinese oracular text, the I Ching.³

At the root of these oracular methods is the belief that the chance processes on which they are based
are synchronized with the "cosmic will": the same force that determines outcomes in our lives
determines the chance process used by the oracle. The psychologist Carl Jung coined the word
synchronicity, which he defined as "meaningful coincidence." This is a purely descriptive term that denotes
an association between an objective event and its subjective significance; Jung did not imply a



Composition and Methodology


necessary causal association between an objective event and any personal subjective meaning. Of
course, many people through the ages have believed that they (or their local soothsayer) knew how
to interpret the synchronistic thread between an event and its subjective meaning. But interpreting
an oracle requires a way of decoding the message that is supposedly implied by the "cosmic will."
As the histories of supplicants at Delphi can attest, this is generally very difficult to do.

However, even if we don't use an oracle in an interpretive way, we can still consider that chance
operations incorporated into an art form allow nature to speak to us through that art, and we can
appreciate the message aesthetically even if we don't claim to understand it.

9.7 Randomness 

Randomness is literally in the eye of the beholder. We can derive randomness from any natural
process, such as the flood tides of the Nile, drawing numbered balls from an urn, the motion of wind
or waves, the distribution of ink splotches on a page, the motion of atoms near an electrode, or
throwing dice.

Heitor Villa-Lobos used the skyline of New York City to create the melody for his composition
New York Skyline. John Cage composed Atlas Eclipticalis using astronomical charts. Charles
Dodge used fluctuations of the earth's magnetic field to create a large work of electronic music
titled The Earth's Magnetic Field.

Are the New York skyline, the contents of astronomical charts, and fluctuations in the magnetic
field random processes? Aren't building heights in New York a function of the building codes?
Isn't the distribution of stars a function of the laws of gravitation? True, but the central question
is epistemological, not physical: can we determine a formula that exactly characterizes the
phenomenon? If not, it is a random process to the observer. Note that this implies there is no
randomness without an observer.

9.7.1 What Constitutes Randomness?

The crucial characteristic of useful random processes is that chance events must be independent
of each other. By independent I mean that even knowing a very large set of outcomes does not help
us guess any other outcomes. If the outcomes of a random process are absolutely independent, then
the process is infinitely random and aperiodic. Such a sequence constitutes an inexhaustible source
of surprise and novelty.

A random process can be viewed as distributed in time or in space. For example, the location
of stars in the night sky constitutes a very slowly changing random function of position, one that
evolves over millions of years. More typically, we examine the sequential outcomes of a spatially
localized random process like a coin toss, which we view as a function of time. This in turn suggests
additional qualities of a random process:

■ Uniform distribution Does the sequence enumerate all possible outcomes? Are all values more
or less equally likely, or are some regions favored over others?





■ Uniformity over ranges What are the permutational characteristics of the sequence? Are there
patterns in the data, that is, ranges of numbers that resemble each other in some predictable way? Are
subsequent values correlated somehow with previous values?

■ Uniformity over frequency What is the rate of change of values in the sequence? Do the
magnitudes change slowly or quickly? If magnitudes change quickly, the data form a jagged series of
abrupt peaks and valleys, corresponding to high frequencies. If magnitudes change slowly, the next
number in the sequence won't be very far from previous values, and the data form a smoother, less
jagged curve, corresponding to low frequencies.⁴
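
The three qualities listed above can be probed numerically. The following is a minimal sketch in Python (used here in place of the book's Musimat); the helper names `bucket_counts` and `lag1_correlation` are hypothetical and are not taken from the text. Bucket counts reveal whether some regions are favored, and the lag-1 correlation is strongly positive for slowly changing data and strongly negative for data that alternate abruptly.

```python
def bucket_counts(values, lo, hi, buckets):
    """Count how many values fall into each of `buckets` equal-width bins over [lo, hi)."""
    counts = [0] * buckets
    for v in values:
        index = int((v - lo) * buckets / (hi - lo))
        counts[index] += 1
    return counts


def lag1_correlation(values):
    """Correlation between each value and its successor (a rough smoothness measure)."""
    mean = sum(values) / len(values)
    deviations = [v - mean for v in values]
    numerator = sum(d0 * d1 for d0, d1 in zip(deviations, deviations[1:]))
    denominator = sum(d * d for d in deviations)
    return numerator / denominator


ramp = list(range(10))      # slowly changing: each value is close to the last
zigzag = [0, 1] * 4         # rapidly changing: abrupt peaks and valleys

print(bucket_counts(ramp, 0, 10, 5))    # evenly spread across the five buckets
print(lag1_correlation(ramp))           # strongly positive
print(lag1_correlation(zigzag))         # strongly negative
```

A sequence that is "random enough" in the sense of this section would show roughly equal bucket counts and a lag-1 correlation near zero.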

9.7.2 Pseudorandomness 

Computers can only execute methods that are strictly algorithmic, and the effectiveness
requirement for algorithms rules out anything that depends upon unknowns; hence computers cannot be
a source of true random sequences by design. If a computer ever did anything genuinely random,
it would have to go in for repairs. Nonetheless, mathematicians have spent a fair amount of effort
trying to develop computable sources of randomness. John von Neumann (1963), mathematician
and pioneer in computer science, recognized this contradiction and is widely quoted as having said,
"Anyone who considers arithmetical methods of producing random digits is, of course, in a state
of sin." This is a droll remark, coming as it does from a pioneer of deterministic methods of
generating random numbers.

By the principle of independence, we only know that a sequence is perfectly random if it never
repeats its choices in whole or in part. But in practice a sequence does not have to be perfectly
random to be useful. It need only be "random enough" to surprise us. So randomness is essentially
an empirical criterion that we use to characterize processes we can't predict.

Although computers can't generate pure random sequences, there are numerical techniques that
allow computers to generate number sequences that are "random enough" for practical use.
However, all such computer-generated sequences eventually repeat, so they are pseudorandom. I
present a simple but effective approach to generating pseudorandom numbers in section 9.7.3, but
a brief digression into polynomials is required first.

Polynomials We can express any number N as a polynomial of integers in base b, for instance,
123 in base 10, written as

$$123 = (3 \times 10^0) + (2 \times 10^1) + (1 \times 10^2).$$

The ratio of some numbers in some bases produces an infinite polynomial expansion, such as

$$\frac{10}{3} = (3 \times 10^0) + (3 \times 10^{-1}) + (3 \times 10^{-2}) + \cdots = 3.33333\ldots$$

Sometimes a cyclic polynomial is produced, such as

$$\frac{13}{7} = 1.857142857142\ldots$$





Irrational numbers represented as polynomials in any base produce an infinitely noncyclic
sequence of digits. For instance, the irrational number π = 3.1415927... shows no apparent pattern
in its infinite polynomial expansion. New techniques are available to calculate arbitrary digits of
π with good efficiency and without having to know the preceding digits, making this a possible
source of random values.⁵ We can calculate successive random digits from the fractional values
of an irrational number. Cyclic polynomials are quite easy to calculate, and we can generate
sequences that are quite long and have good uniformity.

Converting Polynomials to Digit Sequences We can convert any polynomial sequence into
a sequence of digits as follows. For some radix base, b, let f be a fraction:

$$f = a_1 b^{-1} + a_2 b^{-2} + \cdots + a_n b^{-n}.$$

All the values of a must lie within the radix, that is, they must satisfy $0 \le a_i < b$. (For
instance, the decimal system has radix 10, and so values must lie between 0 and 9.) If we multiply
the fraction f by b, we will shift the first fractional digit, $a_1$, out of the fraction and into the
units place:

$$bf = a_1 + a_2 b^{-1} + a_3 b^{-2} + \cdots + a_n b^{-n+1}.$$

In this way we have isolated $a_1$ in the units place. If we repeatedly multiply the result of the
previous step by b, we push the next digit out of the polynomial's fractional value into the units place.
For example, let f = 0.2615, and b = 10:

0.2615 · 10 = 2.615
2.615 · 10 = 26.15
26.15 · 10 = 261.5
261.5 · 10 = 2615.0

We can use this technique to extract successive digits to form random number sequences.
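
The multiply-and-shift procedure can be sketched in a few lines of Python (standing in for the book's Musimat). The function name `fractional_digits` is hypothetical, and exact rational arithmetic from the standard `fractions` module is an assumption made here so that floating-point rounding does not corrupt the digits. Unlike the worked example above, this sketch removes each digit from the units place after reading it.

```python
from fractions import Fraction


def fractional_digits(f, base, count):
    """Return the first `count` digits of the fractional part of f in the given base."""
    f = f - (f.numerator // f.denominator)       # discard the integer part
    digits = []
    for _ in range(count):
        f *= base                                # shift the next digit into the units place
        digit = f.numerator // f.denominator     # read off the units place
        digits.append(digit)
        f -= digit                               # keep only the remaining fraction
    return digits


print(fractional_digits(Fraction(2615, 10000), 10, 4))   # [2, 6, 1, 5]
print(fractional_digits(Fraction(13, 7), 10, 12))        # the cycle 8, 5, 7, 1, 4, 2 twice
```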

9.7.3 Linear Congruential Method

Equation (9.1) shows the linear congruential method for generating random numbers, introduced
by D. H. Lehmer in 1948 (Knuth 1973, vol. 2):

$$x_{n+1} = ((a x_n + b))_c, \quad n \ge 0. \tag{9.1}$$

The notation $((x))_n$ means "x is reduced modulo n." The result is the remainder after integer division
of x by n (see appendix A).

Equation (9.1) is a recurrence relation because the result of the previous step ($x_n$) is used to
calculate a subsequent step ($x_{n+1}$). It is linear because the ax + b part of the equation describes a
straight line that intersects the y-axis at offset b with slope a. Congruence is a condition of
equivalence between two integers modulo some other integer, and refers here simply to the fact that
modulo arithmetic is being used.





For successive computations of x, the output will grow until it reaches the value c. When c is
exceeded, the new value of x is effectively reset to x - c by the modulus operation. A new slope
will grow from this point, and this process repeats endlessly.

The result can be quite predictable depending upon the values of a, b, c, and $x_0$. For instance,
if a = b = $x_0$ = 1, and c = ∞, an ascending straight line at a 45° slope is produced. However, for other
values, the numbers generated can appear random.

In practice, the modulus c should be as large as possible in order to produce long random
sequences. On a computer, the ultimate limit of c is the arithmetic precision of that machine. For
example, if the computer uses 16-bit arithmetic, random numbers generated by this method can
have at most a period of $2^{16}$ = 65,536 values before the pattern repeats.
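
The dependence of the period on the parameters can be checked directly for small moduli. The sketch below is in Python rather than Musimat; the function name `lcg_period` and the toy parameter values (modulus 16) are illustrative assumptions, not values from the text. It iterates equation (9.1) until a value recurs and reports the cycle length.

```python
def lcg_period(a, b, c, x0):
    """Length of the cycle that x -> (a*x + b) mod c eventually falls into."""
    first_seen = {}                  # value -> step at which it first appeared
    x, step = x0, 0
    while x not in first_seen:
        first_seen[x] = step
        x = (a * x + b) % c
        step += 1
    return step - first_seen[x]


print(lcg_period(5, 3, 16, 1))   # 16: every value below the modulus is visited
print(lcg_period(3, 0, 16, 1))   # 4: a poor choice of a and b wastes most of the range
```

The period can never exceed the modulus, which is why a larger c permits longer sequences, but only a careful choice of a and b actually achieves a long period.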

The quality of randomness within a period varies depending on the values chosen for a, x, and b.
Much heavy-duty mathematics has been expended choosing good values (Knuth 1973, vol. 2).
For 32-bit arithmetic, Park and Miller (1988) recommend a = 16,807, b = 0, and c =
2,147,483,647.

The linear congruential method is appealing because once a good set of the parameters is found,
it is very easy to program on a computer. The LCRandom() method returns a random number by
the linear congruential method each time it is called:

    // Constants from Park and Miller
    Constant Integer a = 16807;       // a, b, and c are global constant values
    Constant Integer b = 0;
    Constant Integer c = 2147483647;
    Integer x = 1;                    // x stores the value produced by
                                      // LCRandom between invocations

    Integer LCRandom() {
        x = Mod(a * x + b, c);        // update x based on its previous value
        Integer r = x;                // x may be positive or negative
        If (r < 0)                    // force the result to be positive
            r = -r;
        Return(r);
    }

The parameters a, b, and c are constant (time-invariant) system parameters. Parameter x is
initialized in this example to 1, but it can be initialized to any other integer. The value of a * x + b
is calculated, the remainder is found modulo c, and the result is reassigned to x.

While a * x + b is less than c, x grows from one call to the next. When the expression a * x + b
eventually produces a value beyond the range of c, then x is reduced modulo c. The random effect of
this method comes from the surprisingly unpredictable sequence of remainders generated by the
modulus operation, depending upon careful choice of parameters.

The calculation of x ranges over all possible positive and negative integers smaller than the
value of ±c. But it is generally preferable to constrain its choices to a range. To make this
conversion easier, we force the result to be a positive integer.
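
For readers who want to run the generator, here is a minimal Python sketch of the same recurrence, using the Park and Miller constants quoted in the text. The class name and the callable interface are assumptions made for this illustration. Python integers do not overflow and `%` with a positive modulus never returns a negative value, so the `abs()` call is kept only to mirror the sign-forcing step described above.

```python
A = 16807          # multiplier
B = 0              # increment
C = 2147483647     # modulus, 2**31 - 1


class LCRandom:
    """Linear congruential generator: x <- (A*x + B) mod C."""

    def __init__(self, seed=1):
        self.x = seed            # state kept between calls

    def __call__(self):
        self.x = (A * self.x + B) % C
        return abs(self.x)       # always non-negative here; mirrors the sign-forcing step


rng = LCRandom(seed=1)
print([rng() for _ in range(3)])   # begins 16807, 282475249, ...
```

A well-known check for these constants is that, starting from a seed of 1, the 10,000th value is 1043618065.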






Seeding the Random Number Generator Unlike the natural sources of randomness, LCRandom()
will always produce the same sequence with the same initial parameters. Different sets of
pseudorandom sequences can be generated by varying the initial value of x, as with the following
function:

    SeedRandom(Integer s) { x = s; }   // set global variable x to seed s

This function allows us to set the initial value of x. If we initialize x to a parameter such as the
current time in seconds from some fixed moment, then we start at a different place in the
pseudorandom cycle each time (although, of course, this is finite, too, because the sequence length is
necessarily limited).

The linear congruential method is simple and efficient, but it is hardly the best source of random
values. Even ignoring the fact that it repeats, its uniformity is not wonderful. Knuth (1973, vol. 2)
cautioned, "Random number generators should not be chosen at random." For superior techniques,
see Press et al. (1988, 210). However, this method is very simple to implement and has the
advantage over natural random processes of providing the same pseudorandom sequence if seeded with
the same values.

Random Real Numbers The LCRandom() method returns integers between 0 and c. It is
straightforward to map its output to any range of Real values between an upper bound U and a
lower bound L:

    Real Random(Real L, Real U) {
        Integer i = LCRandom();       // get a random integer value
        Real r = Real(i);             // convert it to a real value
        r = r/Real(c);                // scale it to 0.0 <= r < 1.0
        Return(r * (U - L) + L);      // scale it to the range L to U
    }

First, we use LCRandom() to get a random integer. Recall that LCRandom() forces the result
to be positive. We promote its random integer result to Real and store it in r. Next, we divide it
by c so its range is 0.0 <= r < 1.0. Finally, we scale it by the difference between U and L,
and add L, so that the random value is bounded above by U and below by L. That way we can get
a random result from a particular range of values that we can stipulate.
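
The same scaling can be tried in Python. This is a sketch under stated assumptions: `make_lcrandom` and `random_real` are hypothetical names, and the generator is rebuilt here from the Park and Miller constants so that the block is self-contained.

```python
A, B, C = 16807, 0, 2147483647   # Park and Miller constants, as quoted in the text


def make_lcrandom(seed=1):
    """Return a function that yields successive linear congruential integers."""
    state = {"x": seed}

    def lcrandom():
        state["x"] = (A * state["x"] + B) % C
        return state["x"]

    return lcrandom


def random_real(lower, upper, lcrandom):
    """Map the next integer from lcrandom() into the range lower <= r < upper."""
    r = lcrandom() / C               # scale to 0.0 <= r < 1.0
    return r * (upper - lower) + lower


rng = make_lcrandom(seed=1)
samples = [random_real(-1.0, 1.0, rng) for _ in range(5)]
print(samples)                       # five values between -1.0 and 1.0
```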

Random Integer Numbers Scaled to an Arbitrary Range We can adapt the Random()
function to return integers within a specified integer range. When a real value is converted to an
integer, we truncate (discard) the fractional part, leaving the integer part. For example,

    Real x = 3.14159;
    Integer i = Integer(x);
    Print(i);

prints 3. For nonnegative values, truncation is equivalent to the floor function, written ⌊3.14159⌋ = 3.






Here is a method to generate integer random values over an integer range.

    Integer Random(Integer L, Integer U) {
        Real rL = L;                  // convert L to Real
        Real rU = U + 1.0;            // convert U to Real, add 1.0
        Real x = Random(rL, rU);      // get a real random value
        Return(Integer(x));           // return it as an integer
    }

Note that I added 1.0 to the upper real boundary. Truncation of the random result necessitates
slightly increasing the top end of the range of choice. For example, in order to choose a value in
the integer range 0 to 9, we must generate a random real value x that lies in the range 0.0 <= x < 10.0.
This gives an equal chance of obtaining an integer in the range 0 to 9.
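
A Python sketch of the integer version follows. The names `make_lcrandom` and `random_int` are hypothetical. One deliberate departure from the method above: this sketch uses `math.floor` rather than truncation, an assumption made so that ranges with a negative lower bound are also handled evenly (truncation rounds toward zero and would favor 0 in such ranges).

```python
import math

A, B, C = 16807, 0, 2147483647   # Park and Miller constants, as quoted in the text


def make_lcrandom(seed=1):
    """Return a function that yields successive linear congruential integers."""
    state = {"x": seed}

    def lcrandom():
        state["x"] = (A * state["x"] + B) % C
        return state["x"]

    return lcrandom


def random_int(lower, upper, lcrandom):
    """Return an integer n with lower <= n <= upper."""
    r = lcrandom() / C                          # 0.0 <= r < 1.0
    x = r * ((upper + 1) - lower) + lower       # lower <= x < upper + 1
    return math.floor(x)                        # floor, so negative ranges work too


rng = make_lcrandom(seed=1)
draws = [random_int(0, 9, rng) for _ in range(1000)]
print(sorted(set(draws)))                       # which of the digits 0 to 9 occurred
```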


9.8 Chaos and Determinism 

Dynamics is a field of classical mechanics that studies how force affects motion of material bodies
through time. A system is dynamical if its subsequent state depends upon its current and previous
states. A flying airplane is an example of a dynamical system. Suppose $x_n$ represents the current
position of an airplane, and $x_{n+1}$ represents its next position. Then the relation between these two
positions,

$$x_{n+1} = f(x_n), \tag{9.2}$$

is dynamical because its subsequent state ($x_{n+1}$) is a function f of its previous state ($x_n$). Equation (9.2)
is another example of a recurrence relation because it shows the relation between subsequent
values of a function.

A dynamical system may depend upon its current inputs as well as its past outputs. For
example, the airplane's position will also depend upon the operation of its controls and the forces of
the air.

A system is deterministic if every cause has a unique effect. The uniqueness requirement goes
from cause to effect but not necessarily from effect to cause. For example, the function $y = x^2$ is
deterministic because y can be predicted given x, but one can't necessarily deduce x given y because
there may be two choices.

Because the LCRandom() method is a deterministic way of generating what appear to be
random values, it is a chaotic system. The term chaotic has been taken by physicists to mean a
deterministic system that appears to be random, such that it is impossible to make long-range
predictions about its behavior. Although it repeats over large spans of time, a simple system like
LCRandom() can behave so unpredictably in the short run that it would be very difficult to deduce
its rather simple generating structure from its output alone.

A chaotic system is one that we know to be deterministic but that appears to be random. A truly
random system is nondeterministic. Therefore, pseudorandom systems are chaotic, not random.







9.8.1 Sensitivity to Initial Conditions

A key characteristic of chaotic systems such as LCRandom() is their sensitivity to initial
conditions. Even the smallest changes to variable x, the random seed, can lead over time to totally
different behaviors of the system to a point where the differences far overshadow the similarities.
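
The effect of a small change in the seed can be observed directly. The following Python sketch (the function name `lcg_sequence` is hypothetical; the constants are the Park and Miller values quoted earlier) runs the linear congruential recurrence from seeds 1 and 2 and prints the two sequences side by side with their difference. With b = 0, the gap between the two runs is multiplied by a at each step until the modulus intervenes, so seeds that differ by 1 are far apart after only two steps.

```python
A, B, C = 16807, 0, 2147483647   # Park and Miller constants, as quoted in the text


def lcg_sequence(seed, count):
    """Return the first `count` values of x <- (A*x + B) mod C starting from seed."""
    x = seed
    values = []
    for _ in range(count):
        x = (A * x + B) % C
        values.append(x)
    return values


run_a = lcg_sequence(1, 5)
run_b = lcg_sequence(2, 5)
for value_a, value_b in zip(run_a, run_b):
    print(value_a, value_b, abs(value_a - value_b))
```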

Natural examples of chaotic dynamical systems include the earth's atmosphere and the vibrations
of virtually all sources of musical sound, such as the scrape of a bow on the strings or the turbulent
flow of air from the player's lips over the fipple of a flute. Small differences in initial conditions can
be amplified by such systems to such an extent that any error in measuring the initial conditions can
render any long-range forecast of system behavior wildly inaccurate, even if there is no further
disturbance to the system. The weather from day to day is never exactly the same. Notes played on a
flute, though they may sound alike, are never exactly the same. Our ears gloss over these differences,
hearing sound categorically. But if we wish to understand the precise mechanism of a dynamical
system so as to accurately predict its behavior over time, the initial conditions must be known exactly.

By using more accurate measurements on such natural systems, we can reduce but not eliminate
measurement uncertainty. But only if we measured with infinite precision (an impossible
task) would we be able to eliminate all uncertainty, and only then would the initial conditions allow
us to obtain utterly predictable behavior from a model of a dynamical system. The implicit Western
scientific assumption has been that we can continue to shrink the uncertainty of a dynamical
system's outcome by measuring its initial conditions with ever greater precision. Thus, we assume that
more nearly perfect predictions could be made by supplying more precise initial conditions.

However, through the work of the mathematician Henri Poincaré (1854-1912), we know that
there are systems whose long-term predictions are not improved by increased precision of the
initial conditions. While studying the gravitational influences of three bodies upon each other, he
discovered that under certain circumstances, even if the initial uncertainties are infinitesimal, the
predicted outcomes can be so different that the deterministic prediction is really no better than if
the prediction had been made by chance. This is how sensitivity to initial conditions is tied to the
appearance of randomness.

To illustrate this point, Edward Lorenz (1972), another pioneer in chaos theory, wrote a paper
titled "Predictability: Does the Flap of a Butterfly's Wings in Brazil Set off a Tornado in Texas?"⁶
Unlike the debate about the number of angels that can dance on the head of a pin, the answer to
Lorenz's question (yes) has dramatic consequences for the limits of epistemology. Many if not
most of the basic systems in life, such as the weather, are chaotic dynamical systems, and we are
unable to predict the long-range behavior of any such system whose initial conditions we don't
know with infinite precision. Alas for the human condition, this explains why we are blind to the
future until it is upon us. This is the glass cage that confines our Faustian desires.

9.8.2 Complexity Theory

Complex dynamical systems such as clouds can be seen from a reductionistic perspective as
merely disorganized collections of water droplets. However, these systems also have an evident




self-organizing flow. For example, the shape of a cloud will grow and transform in a manner that
reveals an emergent internal structure. We can summarize this by saying that unconstrained
complex dynamical systems have a natural and innate tendency to move toward complexity.

A complex system contains elements that are both differentiated (specialized or
compartmentalized) and integrated (connected or unified) on all levels of scale. Its complexity comes about through
the interaction of internal and external constraints. For example, the internal constraints of a cloud
are the molecular forces of the air and water, and the external constraints are the winds that drive
it. The internal constraints of the brain are the synaptic connections, and the external constraints are
the flow of information from outside events and other minds. The internal constraints of music are
the criteria of musical perception and cognition, and the external constraints are the flow of
expectation from musician to listener and the return flow of interest from listener to musician.

When a system is not in complexity, it tends toward monotony (saturated integration) or
cacophony (disintegration).

What benefit does complexity provide to a system? Why do clouds not make geometric
patterns in the sky or devolve into utter randomness? The reason is that when a system moves toward
complexity, it is in its most stable, adaptive, and flexible state. When a brain is stuck in linear
thinking or lost in confusion, it may not thrive. When music is not in complexity, we stop
listening.

These characteristics of self-regulation are cornerstones of healthy responsiveness to life and
mental well-being. How appropriate that stability, adaptability, and flexibility are also hallmarks
of successful music. How interesting it is that these qualities emerge through the dynamic
interplay of differentiation and integration on all levels of scale in a musical work. How natural it
seems to think of music as embodying these core principles of stamina and health. Here is the
foundation for a music theory that weaves together information theory, chaos theory, complexity
theory, cognitive psychology, and nonlinear dynamics in a way that honors music's therapeutic
capacities.

9.9 Combinatorics

The discussion now shifts to more practical concerns. If composing is about methodologies of
choice, it is worth wondering about the range of choices that various musical systems provide to
the composer. These questions are studied by the field of combinatorics.

As the name suggests, combinatorics is the study of how sets can be combined in patterns. This
includes enumerating all the possible permutations of a set. Some musical questions opened up by
combinatorics include the number of orderings of musical motives within a scale, the total number
of diatonic scales, and the number of possible melodies of a certain length. In the early twentieth
century, composers of the second Viennese school associated with Arnold Schoenberg borrowed
ideas from the mathematics of combinatorics to construct a radically different kind of music
than had ever been heard before. This brief study of combinatorics leads to an overview of their
techniques.





Figure 9.7 

Guido's method expressed as a tree of possibilities.

9.9.1 Enumeration 

If we examine the possible outcomes of Guido's method, we see that there are three choices at each
step. For a one-vowel text there are 3 possible one-note melodies; for a two-vowel text, there are $3^2$
melodies of two notes; and for an N-vowel text there are $3^N$ melodies. Thus the number of possible
melodies grows exponentially for longer texts (figure 9.7). This demonstrates the principle of enumeration:

If there are m outcomes of operation 1, and then n outcomes of operation 2, the composite
number of outcomes of operation 1 followed by operation 2 is m times n.

For instance, for Guido's method, m and n are both 3. So for step 2, the number of outcomes is
3 · 3 = 9 (see figure 9.7).

Enumerating the possibilities of something means itemizing all possible outcomes. Counting all
such outcomes is to enumerate them. For instance, how many 12-note melodies can be formed from
the dodecaphonic scale? By the principle of enumeration, there are 12 possibilities for the first
note, and then 12 possibilities for the second note, and then ... through 12 steps. So the answer
is $12 \cdot 12 \cdot 12 \cdots = 12^{12} \approx 8.9 \times 10^{12}$, which is nearly nine trillion. The set of melodies includes,
for instance, the ascending and descending chromatic scales, the first 12 notes of Antonio Carlos
Jobim's One Note Samba transposed to all 12 pitches, and the first 12 pitches of every part of every
symphony and opera that has ever been written, or could ever be written, in a dodecaphonic scale.
There are more 80-note chromatic melodies than there are subatomic particles in the universe
(assuming there are $10^{80}$ or so such particles). I suppose this makes it pretty unlikely that future
composers will run out of material to work with!
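
The counts in this section are easy to confirm by machine. The Python sketch below is an illustration, not part of the original text: the function name `count_melodies` and the labels "A", "B", "C" for the three choices at each step of Guido's method are placeholders.

```python
from itertools import product


def count_melodies(choices_per_step, steps):
    """Principle of enumeration: multiply the number of choices at each step."""
    return choices_per_step ** steps


# Itemize every two-note outcome when there are three choices per step.
two_note_melodies = list(product("ABC", repeat=2))
print(len(two_note_melodies))                 # 9, that is, 3 * 3

print(count_melodies(12, 12))                 # 8916100448256, nearly nine trillion
print(count_melodies(12, 80) > 10 ** 80)      # True
```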

9.9.2 Permutation 

The principle of enumeration answers the question, How many total outcomes are possible? The
principle of permutation answers the question, How many unique orderings are possible?

For instance, how many ways are there to order the sequence, a, b, c? We find out by swapping
the elements around until we run out of unique orderings. Let's use the method where we swap the
last two elements, then the previous two elements, and so on (figure 9.8). We can create six
permutations this way before the reordering procedure recreates the original ordering. So there can
be six permutations of three things. But how could we discover the number of possible
permutations without having to reorder and inspect them?


[Figure 9.7 tree levels: Start = $3^0$; 1st note choice = $3^1$; 2nd note choice = $3^2$; Nth note choice = $3^N$]






Figure 9.8 

Permutation of three objects. 


a b c
b c a
c a b
a b c

Figure 9.9 

Example of circular permutation. 

Let's find the solution through another musical example. How many unique 12-tone rows
are there in the set of dodecaphonic scales? Recall that when we enumerated all the 12-note
melodies, we could pick from all 12 pitches at every step. That led to melodies with repeated
notes. But a tone row is defined as a melody of 12 nonrepeating pitches, so we must exclude
whatever pitch is chosen from subsequent choices. We can choose from 11 pitches for the
second note, 10 for the third, and so on. Otherwise, the process is just like enumeration. Thus
the number of unique orderings is $12 \cdot 11 \cdot 10 \cdot 9 \cdots = 12! = 479{,}001{,}600 \approx 4.79 \times 10^8$, or about
479 million 12-tone rows in the dodecaphonic system. As one might expect, there are
substantially fewer permutations of 12 tones than there are enumerations of them. Thus there are n!
permutations of n objects. Going back to the first example, there are 3 · 2 · 1 = 6 permutations
of three objects.
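
Both results can be verified with Python's standard library. This sketch is an illustration added here, not the author's code: `itertools.permutations` lists the orderings of a, b, c explicitly, and `math.factorial` gives the count of 12-tone rows without listing them.

```python
from itertools import permutations
from math import factorial

# Reorder and inspect: list every ordering of three objects.
orderings = ["".join(p) for p in permutations("abc")]
print(orderings)          # six orderings, starting with 'abc'
print(len(orderings))     # 6

# Count without inspecting: n! permutations of n objects.
print(factorial(3))       # 6
print(factorial(12))      # 479001600 twelve-tone rows
```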

9.9.3 Circular Permutation 

A circular permutation, or rotation, occurs when the element at one end is lopped off and attached
to the other end circularly (figure 9.9). There are n circular permutations of n objects. Rotation can
be to the left or right by one or more places.

Here is a method that rotates a list by an arbitrary number of places either to the right or left:


    Rotate(IntegerList Reference f, Integer n, Integer i) {
        n = Mod(n, Length(f));                  // constrain rotation to length of list
        Integer x = f[i];                       // store f[i] for use after recursion
        If (i < Length(f)-1)                    // reached the end?
            Rotate(f, n, i+1);                  // no, call Rotate() recursively
        // continue from here when the recursion unwinds
        Integer pos = PosMod(i+n, Length(f));   // index list modulo its length
        f[pos] = x;                             // assign value of x saved above
    }


This example uses recursion to perform its function. It takes three arguments, a list f, the number
n of positions to rotate by, and i, the index for where to begin, usually set to zero. If n is positive,
the list is rotated to the right that many places; if n is negative, the list is rotated that many positions
to the left. The first step is to constrain n modulo the length of the list so that any amount of rotation
can be handled.

The declaration IntegerList Reference f requires a bit of explanation. We want
Rotate() to modify the list that is supplied. But functions are ordinarily supplied only with
copies of the value of the actual arguments (see appendix B, B.1.22). The word Reference in the
declaration tells Musimat that it should supply Rotate() with the actual variable named when the
function is invoked. Thus changes to the list handed to Rotate() will persist after the function
is finished.

We need to make sure the variable pos stays within the range of valid list elements, which
naturally suggests the use of Mod(), except that Mod() can return negative values. But list indexes
must not be negative. So we use a function called PosMod(), which returns only the positive
wing of modulo values (see appendix A.6).
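
As a cross-check, the recursive rotation can be ported to Python. This is a sketch, not the book's Musimat code: the lowercase name `rotate` and the default `i=0` are choices made here. Python lists are passed by reference, so no `Reference` keyword is needed, and Python's `%` operator already returns a non-negative result for a positive modulus, so it plays the role of PosMod().

```python
def rotate(f, n, i=0):
    """Rotate list f in place: right by n places if n > 0, left if n < 0."""
    if not f:
        return                     # nothing to rotate
    n = n % len(f)                 # constrain rotation to the length of the list
    x = f[i]                       # save f[i] before any element is overwritten
    if i < len(f) - 1:             # reached the end?
        rotate(f, n, i + 1)        # no, recurse on the next index
    # continue from here as the recursion unwinds
    f[(i + n) % len(f)] = x        # place the saved value at its rotated position


right = [0, 1, 2, 3, 4, 5]
rotate(right, 1)
print(right)                       # [5, 0, 1, 2, 3, 4]

left = [0, 1, 2, 3, 4, 5]
rotate(left, -1)
print(left)                        # [1, 2, 3, 4, 5, 0]
```

Every level of the recursion saves its element before any assignment happens, which is why the list can be rewritten in place without losing values.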

Table 9.3 shows left and right rotation by various amounts for a list iL defined as

    IntegerList iL = {0, 1, 2, 3, 4, 5};

Table 9.3

Left and Right Rotation

[Table columns: Rotate(iL, -n, 0) and Rotate(iL, n, 0), for various n]

9.9.4 Partitioning

Suppose we want to create a 12-tone row consisting of the 12 pitches partitioned into three motives
of six notes, three notes, and three notes each. How many different motives could there be? Clearly,
the total number of unique tone rows is still 12!. But the number of unique motives should be fewer
than 12! because the tone row space is divided into groups.

Here's a possible way of doing it:

1. Assign pitches to the motives. For instance, assign pitches C-F to the six-note motive, pitches
F#-G# to the first three-note motive, and pitches A-B to the second three-note motive.

2. Order the pitches in the first motive.

3. Order the pitches in the second motive.

4. Order the pitches in the third motive.

We don't know how many outcomes are possible for step 1 yet, so let's call this the unknown, x,
for now.

Steps 2, 3, and 4 are ordering operations. Because ordering operations are permutations, there
are 6!, 3!, and 3! orderings in steps 2, 3, and 4, respectively.

Note that steps 1-4 enumerate the steps of creating a tone row according to the motivic arrangement.
Remembering the rule for enumeration, that means the total number of unique tone rows
would be 6!3!3!x. But since the total number of unique tone rows is 12!, we can equate these
two pieces of information, yielding 6!3!3!x = 12!. Solving for x yields the number of unique
motives:


\[
x = \frac{12!}{6!\,3!\,3!} = 18{,}480 = \binom{12}{6,\,3,\,3} \tag{9.3}
\]

Thus 12 pitches can be partitioned into 18,480 motives of three subsets of six, three, and three
notes. The rightmost term in (9.3) shows how partitioning is notated. It is read as "the number of
ways 12 objects can be partitioned into groups of six, three, and three."

Generalizing from this particular solution, we can express partitioning N objects into p subsets
of $r_1, r_2, \ldots, r_p$ elements as

\[
\binom{N}{r_1,\, r_2,\, \ldots,\, r_p} = \frac{N!}{r_1!\, r_2! \cdots r_p!} \qquad \text{Partitioning} \tag{9.4}
\]
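As a quick numerical check of the partitioning formula, the following Python sketch evaluates N!/(r_1! r_2! ... r_p!) for arbitrary group sizes. The function name count_partitions is an assumption made for illustration only.

```python
from math import factorial


def count_partitions(*group_sizes: int) -> int:
    """Number of ways to split sum(group_sizes) objects into groups of the given sizes."""
    total = factorial(sum(group_sizes))
    for r in group_sizes:
        total //= factorial(r)  # exact at every step: each partial quotient is an integer
    return total


print(count_partitions(6, 3, 3))  # 18480
```

With groups of six, three, and three the quotient is 479,001,600 / (720 · 6 · 6) = 18,480.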


9.9.5 N Objects R at a Time

Suppose we select seven pitches from the 12 semitones and order them into a seven-note melody.
How many such melodies are there? How many ways are there to select seven notes out of 12?
We can think of this as a kind of partitioning because ordering the melody partitions it into eight
subsets: the first note, the second note, and so forth, up to the seventh note. The eighth subset is
the unchosen pitches out of the original 12, which is 12 - 7 = 5 pitches. So we can use the partitioning
formula, (9.4), as follows:

\[
\binom{12}{1,1,1,1,1,1,1,5} = \frac{12!}{1!\,1!\,1!\,1!\,1!\,1!\,1!\,5!} = 3{,}991{,}680,
\]

that is, 3,991,680 melodies of 12 pitches taken seven at a time. Because 5! = (12 - 7)! we can
express the same thing this way:

\[
\frac{12!}{(12 - 7)!} = 3{,}991{,}680.
\]

This is convenient because 12 represents the total number of elements and 7 represents the size of
the partition. Abstracting based on this example, in general there are

\[
\frac{n!}{(n - r)!} \tag{9.5}
\]

permutations of n objects taken r at a time.
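The count of ordered selections can be checked directly in Python. The helper name ordered_selections is hypothetical; math.perm is the standard-library equivalent (Python 3.8 and later).

```python
from math import factorial, perm


def ordered_selections(n: int, r: int) -> int:
    """Permutations of n objects taken r at a time: n! / (n - r)!."""
    return factorial(n) // factorial(n - r)


print(ordered_selections(12, 7))  # 3991680
print(perm(12, 7))                # 3991680
```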

9.9.6 Combinations 

How many seven-note scales are there in the 12 pitches of the dodecaphonic system? This is like
taking N unordered objects R at a time. It seems reasonable to expect that there will be fewer scales
of seven pitches than melodies of seven pitches because reordering the notes produces a different
melody but leaves the scale unchanged.

We divide the pitches into two groups: seven chosen pitches, and five unchosen pitches. By (9.4),
there must be

\[
\frac{12!}{7!\,(12 - 7)!},
\]

or 792 such scales. This is a rather large number in comparison to the dozen or so of those commonly
in use. So, in general, a partitioning of $\binom{n}{r,\; n-r}$ possible outcomes equals
$\frac{n!}{r!\,(n-r)!}$ actual outcomes.

This is used commonly enough to have its own notation and is usually written

\[
\frac{n!}{r!\,(n - r)!} = \binom{n}{r} \qquad \text{Taking n Unordered Objects r at a Time} \tag{9.6}
\]
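The number of seven-note scales can be confirmed both by formula and by brute-force enumeration. This is a small Python sketch; the variable name scales is illustrative.

```python
from itertools import combinations
from math import comb

# Enumerate every unordered choice of 7 pitch classes out of 12.
scales = list(combinations(range(12), 7))

print(len(scales))  # 792
print(comb(12, 7))  # 792
```

Each enumerated tuple is in ascending order, so two selections that differ only in the order of their notes are counted once.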


9.10 Atonality 

Combinatorics can guide us to a deeper understanding of the compositional aims of atonal music.
I examine this in some depth because composers of atonal music pioneered the use of explicit
compositional methodology to a degree that had not previously been attempted in music. Thus
atonal music is a fruitful field of study for compositional methodology.

In the early part of the twentieth century, Arnold Schoenberg, Alban Berg, and Anton Webern,
the composers of the so-called Second Viennese School, developed a compositional method based
on note patterns that contain all 12 pitches (see section 3.16).





Absolute: 10  9  1 11  5  7  3  4  0  2  8  6

Relative:  0 11  3  1  7  9  5  6  2  4 10  8

Figure 9.10

Tone row for Schoenberg's Fantasy for Violin and Piano, Opus 47.


In Schoenberg's original method, each composition was organized around a particular ordering
of the 12 pitch classes of the chromatic scale that he called a tone row (see section 2.4). No pitch
appears more than once within the row, so none of the 12 pitch classes is favored. Schoenberg's
idea was to use this method to remove any vestiges of tonal harmony from his music, hence to compose
atonal music.

For example, the row shown in figure 9.10 appears in Schoenberg's Fantasy for Violin and
Piano, Opus 47. The pitch classes can be numbered in two ways:

■ Absolute pitch numbers, indexed by chromatic half steps above C

■ Relative pitch numbers, indexed by half steps from the first pitch in the row

Relative indexing has some advantages that will become evident later, so I use that from now on.

The basic method is as follows. Each time a new tone is needed in the composition, the composer
picks the next pitch class in the row, circling back to the first pitch class when the list is
exhausted.

Since the primary aim is to remove tonal references, and other considerations are secondary, the
way in which each pitch class is projected into the composition is left up to the composer. The pitch
classes can be freely applied to any octave, assigned to any instrument, and given any desired
dynamic level, rhythmic value, or performance articulation. Some pitch classes derived from a row
might be used to generate a musical line while others might be used to spell a chord, for example.

The tone row and the plan for how it is to be projected into the composition are separate steps
taken prior to actual composing. This planning stage is called precomposition. The following sections
describe some of the theory of sets and sequences upon which atonal music theory is based.

9.10.1 Series 

In general, a set is an unordered collection of any size. A series is a particular ordering of a set.
A tone row is a series based on a set of pitch classes (Forte 1973). A tone row may contain all or
part of the available pitch classes. Since the distinguishing characteristic of each pitch class is its
chroma (see figure 6.5), the pitch classes can be characterized circularly (figure 9.11).

Figure 9.11

Chroma.

Figure 9.12

The set {4, 6, 7, 10}.

There are over 470 million 12-tone rows in the dodecaphonic system (see section 9.9.2). If we
add to this all rows of less than 12 tones, there are a great many more. But many of them share
characteristics that make them seem related. How can we tell them apart, and how can we characterize
their similarities?

For example, consider the set of pitch classes {4, 6, 7, 10}. By octave equivalence as well as by
circular permutation, we could relate this set to the sets {6, 7, 10, 4}, {7, 10, 4, 6}, and {10, 4, 6,
7}. These sets are equivalent except for their starting points (figure 9.12). They are equivalent
under circular permutation. It would be nice to give them a name that reflects their equivalence.
We could name the whole collection after just one of them, but which of these permutations should
we consider to be the principal one?

Since the distinguishing characteristic of a row is the placement of different-sized intervals, let's
arbitrarily make a rule that the normal form of a set lists the pitch classes in ascending numeric
order (corresponding to counterclockwise motion around the circle) in the intervallically most
compact form. A set is most compact whose interval size between the first and last pitch class is
smallest, modulo 12. For example, with the preceding set permutations, and using the notation
$((x))_{12}$ to denote x modulo 12, we have

\[
\begin{array}{ll}
\{4, 6, 7, 10\} & ((10 - 4))_{12} = 6 \\
\{6, 7, 10, 4\} & ((4 - 6))_{12} = 10 \\
\{7, 10, 4, 6\} & ((6 - 7))_{12} = 11 \\
\{10, 4, 6, 7\} & ((7 - 10))_{12} = 9
\end{array}
\]







Because the order {4, 6, 7, 10} yields the smallest difference (6) between first and last pitch, this
is the normal form for this set. The name for this set of permutations is then [4,6,7,10]. (The set
name is written with brackets and commas without spaces.) This is the way to name all sets that
are equivalent under circular permutation.

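If multiple orderings tie for compactness, we need another rule to break the tie. In this case let's
make a rule that the normal form for a set is the one most compact to the left: if the first-to-last
intervals tie, compare the interval from the first to the second-to-last pitch class, then from the
first to the third-to-last, and so on. So, for example, for the set {0, 3, 6, 7, 9},

\[
\begin{array}{llll}
\text{Ordering} & \text{First to last} & \text{First to second-to-last} & \text{First to third-to-last} \\
\{0, 3, 6, 7, 9\} & ((9 - 0))_{12} = 9 \ \text{(tied)} & ((7 - 0))_{12} = 7 & \\
\{3, 6, 7, 9, 0\} & ((0 - 3))_{12} = 9 \ \text{(tied)} & ((9 - 3))_{12} = 6 \ \text{(tied)} & ((7 - 3))_{12} = 4 \\
\{6, 7, 9, 0, 3\} & ((3 - 6))_{12} = 9 \ \text{(tied)} & ((0 - 6))_{12} = 6 \ \text{(tied)} & ((9 - 6))_{12} = 3 \ \text{(most compact)} \\
\{7, 9, 0, 3, 6\} & ((6 - 7))_{12} = 11 & & \\
\{9, 0, 3, 6, 7\} & ((7 - 9))_{12} = 10 & &
\end{array}
\]

so we name this set [6,7,9,0,3].

If a set is so regular that there is no tiebreaker, then pick the ordering that begins with the lowest
number. For example, {2, 6, 10} has permutations {6, 10, 2} and {10, 2, 6}. Name this one
[2,6,10].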
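The normal-form rules can be collected into a short Python sketch. It follows the three rules as stated here (smallest first-to-last interval, ties broken by the first-to-second-to-last interval and so on, remaining ties broken by the lowest starting number); the names spans and normal_form are hypothetical.

```python
def spans(ordering: list[int]) -> list[int]:
    """Intervals (mod 12) from the first pitch class to the last, second-to-last, ..."""
    first = ordering[0]
    return [(ordering[k] - first) % 12 for k in range(len(ordering) - 1, 0, -1)]


def normal_form(pitch_classes) -> list[int]:
    """Most compact ascending circular ordering of a pitch-class set."""
    ascending = sorted({p % 12 for p in pitch_classes})
    if not ascending:
        return []
    rotations = [ascending[i:] + ascending[:i] for i in range(len(ascending))]
    # Compare span lists lexicographically; fall back to the lowest first element.
    return min(rotations, key=lambda r: (spans(r), r[0]))


print(normal_form([7, 10, 4, 6]))     # [4, 6, 7, 10]
print(normal_form([0, 3, 6, 7, 9]))   # [6, 7, 9, 0, 3]
print(normal_form([10, 2, 6]))        # [2, 6, 10]
```

Comparing the span lists lexicographically applies the tiebreakers in the same order as the worked examples: the outermost interval first, then progressively shorter ones.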

Let's denote the operations required to normalize a set as N(x), where x is the set to be normalized.
Then, for example, we can write

\[
N(\{10, 2, 6\}) = \{2, 6, 10\}.
\]

Set Classes In the previous example, several sets were seen to be related in an algorithmic way
(by circular permutation), so we grouped them together under the name of one of the sets that had
a particularly elegant form: [2,6,10]. A set class is a named group of sets that are equivalent under
specific conditions. In the example, the sets

{{2, 6, 10}, {6, 10, 2}, {10, 2, 6}}

form a set class named [2,6,10] of sets that are equivalent under circular permutation. There are many
ways in which sets can be related into set classes, but the following relations are particularly useful.

Transposition The sets {4, 6, 7, 10} and {9, 11, 0, 3} are related by transposition because if we
transpose each pitch class in the first set up by five semitones (modulo octave equivalence), it
equals the second set. Therefore, these two sets are equivalent under transposition (see section
2.5.4). We can define transposition as

\[
T_n(x) = ((x + n))_{12}, \qquad \text{Transposition} \tag{9.7}
\]

where x is the pitch class to be transposed, and n is the number of degrees by which to transpose
it. To transpose up by 5 we can write

\[
T_5(\{4, 6, 7, 10\}) = \{9, 11, 0, 3\}.
\]





To transpose down by 4,

\[
T_{-4}(\{4, 6, 7, 10\}) = \{0, 2, 3, 6\}.
\]

There are 12 unique transpositions (including the zeroth) of 12 pitch classes. We collectively name
them after the transposition with the smallest initial value, so this set class would be named
[0,2,3,6].

Inversion We can create a mirror image of a set by subtracting each pitch class from the number
of available pitch classes. We can define inversion as

\[
I(x) = ((N - x))_N, \qquad \text{Inversion} \tag{9.8}
\]

where N is the number of available pitch classes, in this case, 12. The modulo operation is needed
to handle the case where x = 0. For example, the sets {4, 6, 7, 10} and {8, 6, 5, 2} are equivalent
under inversion because

\[
\begin{aligned}
((12 - 4))_{12} &= 8 \\
((12 - 6))_{12} &= 6 \\
((12 - 7))_{12} &= 5 \\
((12 - 10))_{12} &= 2.
\end{aligned}
\]

So we can write I({4, 6, 7, 10}) = {8, 6, 5, 2}. Because of this relation, we can also classify
{8, 6, 5, 2} as a member of the set class [4,6,7,10].

It is easy to visualize the effect of inversion by imagining a line bisecting the circle of pitch
classes horizontally (figure 9.13). Pitch classes related by inversion are mirror opposites above and
below this line. This figure shows the original form {4, 6, 7, 10} being inverted by reflection across
the bisecting line into {8, 6, 5, 2}.

Retrograde Sets that are related by having their members in reversed order are equivalent under
retrogression. If R(x) denotes the retrograde of a set x, then we can write, for example, R({4, 6, 7,
10}) = {10, 7, 6, 4} and also classify {10, 7, 6, 4} as a member of set class [4,6,7,10].
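The three operations T, I, and R are each a one-line list transformation. Here is a minimal Python sketch, with hypothetical function names and each function returning a new list rather than modifying its input:

```python
def transpose(pitch_classes: list[int], n: int) -> list[int]:
    """T_n: shift every pitch class by n semitones, modulo 12."""
    return [(p + n) % 12 for p in pitch_classes]


def invert(pitch_classes: list[int]) -> list[int]:
    """I: reflect every pitch class about 0, modulo 12."""
    return [(12 - p) % 12 for p in pitch_classes]


def retrograde(pitch_classes: list[int]) -> list[int]:
    """R: reverse the order of the pitch classes."""
    return pitch_classes[::-1]


s = [4, 6, 7, 10]
print(transpose(s, 5))   # [9, 11, 0, 3]
print(transpose(s, -4))  # [0, 2, 3, 6]
print(invert(s))         # [8, 6, 5, 2]
print(retrograde(s))     # [10, 7, 6, 4]
```

Inversion and retrogression each undo themselves when applied twice, and transposing by n and then by -n returns the original set.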


Figure 9.13

Inversion.


Prime Form All the sets that are equivalent under circular permutation, transposition, retrogression,
and inversion can usefully be grouped into a single set class because they can all
be derived from each other using these operations. But which set should we select as the
progenitor (the "mother of all sets") in its class? The standard convention is to choose the
set that

■ Is in normal form

■ Is most compact to the left

■ Is transposed so that its first pitch class starts at zero.

This set is the prime form of the set class. All other members of the set class are derived from the
prime form.

We find the prime form of {4, 6, 7, 10} as follows. Make its first pitch class start at 0:
T_{-4}({4, 6, 7, 10}) = {0, 2, 3, 6}, and make sure the result is in normal form, which it is. We must
compare this with its inversion to see which is more compact, so: I({4, 6, 7, 10}) = {8, 6, 5, 2}, and
its normal form is N({8, 6, 5, 2}) = {2, 5, 6, 8}.

Finally, we must transpose it, so T_{-2}({2, 5, 6, 8}) = {0, 3, 4, 6}. Since {0, 2, 3, 6} is more compact
to the left than {0, 3, 4, 6}, we name this set complex [0,2,3,6].
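The prime-form procedure can be sketched in Python by chaining the steps described above: normalize the set and its inversion, transpose each to start on zero, and keep the more compact result. All names are hypothetical, and compactness is compared with the same right-to-left span rule used for normal form.

```python
def spans(ordering: list[int]) -> list[int]:
    """Intervals (mod 12) from the first pitch class to the last, second-to-last, ..."""
    return [(ordering[k] - ordering[0]) % 12 for k in range(len(ordering) - 1, 0, -1)]


def normal_form(pitch_classes) -> list[int]:
    """Most compact ascending circular ordering of a pitch-class set."""
    ascending = sorted({p % 12 for p in pitch_classes})
    if not ascending:
        return []
    rotations = [ascending[i:] + ascending[:i] for i in range(len(ascending))]
    return min(rotations, key=lambda r: (spans(r), r[0]))


def to_zero(ordering: list[int]) -> list[int]:
    """Transpose so that the first pitch class is 0."""
    return [(p - ordering[0]) % 12 for p in ordering]


def prime_form(pitch_classes) -> list[int]:
    """Choose the more compact of the zero-based set and its zero-based inversion."""
    if not pitch_classes:
        return []
    original = to_zero(normal_form(pitch_classes))
    inverted = to_zero(normal_form([(12 - p) % 12 for p in pitch_classes]))
    return min(original, inverted, key=spans)


print(prime_form([4, 6, 7, 10]))  # [0, 2, 3, 6]
print(prime_form([8, 6, 5, 2]))   # [0, 2, 3, 6]
```

Both the original set and its inversion reduce to the same prime form, which is what makes the prime form usable as a name for the whole set class.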

Interval Classes So far we have examined just the pitch class content of sets, but a set's interval
content is what provides its musical signature. The interval content is the set of interval classes
between all pitch classes of the set. For example, the interval classes for the set {4, 6, 7, 10} are

Interval   4-6   4-7   4-10   6-7   6-10   7-10

Distance    2     3     6      1     4      3

This set of intervals, {2, 3, 6, 1, 4, 3}, is the intervallic signature of this set. These interval distances
appear in every member of the set class, giving all members of the class the particular sound of the set
class. To borrow an example from tonal harmony, the triads sound like triads because they share the
same interval distances: major third, minor third, and perfect fifth (see section 3.10.2). Similarly,
augmented and diminished triads are distinct to our ears because of their characteristic interval distances.

We can characterize the intervallic profile of a set by making a histogram (a simple ordered
tally) of the number of intervals it contains. In the preceding example, there are two instances of
interval distance 3, and all the rest of the intervals appear only once. If we think of the various interval
classes as making up a set of orthogonal dimensions in interval space, we can consider the number
of repetitions of each interval as the length of a vector in that interval's dimension. The
combination of all these vectors makes a single, unique multidimensional vector characterizing the
intervallic content of the set. For example, the interval class vector for the set shown above is:

Interval class   1   2   3   4   5   6

Quantity         1   1   2   1   0   1

so the interval class vector is [1,1,2,1,0,1], which is the profile of its unique intervallic content and
hence the signature of its unique sound.
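The tally can be computed mechanically. In this Python sketch (function name hypothetical), distances of 7 through 11 semitones are folded onto interval classes 5 through 1; that folding is the usual interval-class convention and is assumed here, although the example set never exercises it.

```python
from itertools import combinations


def interval_class_vector(pitch_classes) -> list[int]:
    """Histogram of interval classes 1..6 over every pair of pitch classes in the set."""
    unique = sorted({p % 12 for p in pitch_classes})
    vector = [0] * 6
    for a, b in combinations(unique, 2):
        distance = (b - a) % 12
        interval_class = min(distance, 12 - distance)  # fold 7..11 onto 5..1
        vector[interval_class - 1] += 1
    return vector


print(interval_class_vector([4, 6, 7, 10]))  # [1, 1, 2, 1, 0, 1]
print(interval_class_vector([0, 2, 3, 6]))   # [1, 1, 2, 1, 0, 1]
```

The set and its prime form give the same vector, consistent with the claim that the interval distances appear in every member of the set class.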

Cardinality The number of unique pitch classes in a set is its cardinality. The maximum cardinality
in the dodecaphonic system is, of course, 12, and there is only one set in this class:
the aggregate set, containing all 12 pitch classes. The minimum cardinality of a set of intervals is
2. Cardinalities between 2 and 11 have the following Latin names: dyad, trichord, tetrachord,
pentachord, hexachord, heptachord, octachord, nonachord, decachord, and undecachord.

Complement Relation If a set class contains fewer than 12 pitch classes, the pitch classes that are
left out are its complement set class. For example, the whole-tone scale has two versions, {0, 2, 4,
6, 8, 10} and {1, 3, 5, 7, 9, 11}, that are complement set classes (see section 2.5.7).

9.11 Composing Functions

In mathematics a function is composable with another if it can be the other's argument. For
instance, if y = f(x), and z = g(y), then z = g(f(x)) is the composition of g with f. Consider the
definitions f(x) = x + 1 and g(x) = x^2. If z = g(f(x)), then z = x^2 + 2x + 1. To be composable,
the range of f must be a subset of the domain of g.
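Function composition is easy to demonstrate in code. The following Python sketch uses the same f and g; the helper name compose is an assumption for illustration.

```python
def compose(g, f):
    """Return the function x -> g(f(x))."""
    return lambda x: g(f(x))


def f(x):
    return x + 1


def g(x):
    return x ** 2


z = compose(g, f)

for x in range(-3, 4):
    # g(f(x)) = (x + 1)^2 expands to x^2 + 2x + 1
    assert z(x) == x ** 2 + 2 * x + 1

print(z(3))  # 16
```

The order matters: composing f with g instead gives x^2 + 1, which is a different function.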

Following Ada Lovelace's train of thought quoted at the beginning of this chapter, let's use
Musimat to create a short excerpt of atonal music using function composition.

9.11.1 Precomposition 

The process of composing atonal music is typically divided into two parts.

■ Precomposing: assembling the musical materials 

■ Composing: applying the assembled materials in a design 

Musimat already has a number of data types and operations, but a few more are needed:

■ To represent pitches as symbols with integer values:

    Integer C = 0, Cs = Db = 1, D = 2, Ds = Eb = 3, ..., B = 11;

■ To represent motives as lists:

    IntegerList a = {F, F, G, A}; IntegerList b = {F, A, G}; IntegerList c = {F, E};
    IntegerList d = {Bb, A, G, F}; IntegerList e = {E, C, E, F, F};

■ To combine motives and concatenate lists:

    IntegerList y = Join(a, b, a, c, a, d, e);

(y is defined as the list of pitches of the tune "Yankee Doodle.")




■ To transpose a pitch set:

    IntegerList transpose(IntegerList L, Integer t) {
        For (Integer i = 0; i < Length(L); i = i + 1)
            L[i] = PosMod(L[i] + t, 12); // shift by t semitones, keeping the result in 0..11
        Return(L);
    }

■ To invert a pitch set:

    IntegerList invert(IntegerList L) {
        For (Integer i = 0; i < Length(L); i = i + 1)
            L[i] = Mod(12 - L[i], 12); // mirror each pitch class about 0
        Return(L);
    }


■ To take the retrograde of a set:

    IntegerList retrograde(IntegerList L) {
        Integer n = Length(L);
        IntegerList R = L; // make a new list as long as L
        For (Integer i = 0; i < n; i = i + 1)
            R[i] = L[n - i - 1]; // copy elements of L in reverse order
        Return(R);
    }


9.11.2 The Set Complex

Using these tools, we can create a matrix containing the prime form, inverse, retrograde, and all
transpositions of any row, called the set complex. The purpose of these transformations is to generate
variants that are related to the original intervallic structure of the prime row, to be used as
material in developing compositions.

Matrix is simply a two-dimensional grid, or list of lists, all of the same length. The individual
elements of Matrix can be accessed by extending the index operator [...]. The first operand is the
matrix, the second is the row position, and the third is the column position. (Whether row or column
comes first is arbitrary. The following order is called row/column order.)

              column 0   column 1
    m = row 0    a          b
        row 1    c          d

For example, for this matrix, m[0][0] == a, m[0][1] == b, m[1][0] == c, and
m[1][1] == d. The following is a method for creating a set complex. It basically copies the prime
form to the zeroth row, then copies the inverse form to the zeroth column, then for each other cell in
the matrix sums the corresponding row and column value modulo the length of the prime form:

    Matrix setComplex(IntegerList prime) {
        Matrix M;
        Integer len = Length(prime);
        IntegerList inverted = invert(prime);
        For (Integer i = 0; i < len; i = i + 1) {
            M[0][i] = prime[i];     // prime form along the zeroth row
            M[i][0] = inverted[i];  // inverse form down the zeroth column
        }
        For (Integer i = 1; i < len; i = i + 1) {
            For (Integer j = 1; j < len; j = j + 1) {
                M[i][j] = Mod(M[i][0] + M[0][j], len);
            }
        }
        Return(M);
    }


To demonstrate these tools, table 9.4 shows the set complex for Schoenberg's Opus 23 #5, Five
Piano Pieces. The prime set {C#, A, B, G, G#, F#, A#, D, E, D#, C, F} is shown in numeric form along
the top row. Prime rows are read left to right, retrograde rows right to left, inverse rows top to bottom,
and retrograde inverse rows bottom to top.
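A set complex can be built in a few lines of Python. This sketch assumes the row is given in relative form, so that it begins on 0 and the top-left cell agrees for both the prime row and the inverse column, and it reduces modulo 12, which equals the row length for a complete 12-tone row. The function name is hypothetical, and the row used is the relative form of the Fantasy row from figure 9.10.

```python
def set_complex(prime: list[int]) -> list[list[int]]:
    """Matrix with the prime row on top, its inversion down the left, sums mod 12 elsewhere."""
    n = len(prime)
    inverted = [(12 - p) % 12 for p in prime]
    matrix = [[0] * n for _ in range(n)]
    for i in range(n):
        matrix[0][i] = prime[i]
        matrix[i][0] = inverted[i]
    for i in range(1, n):
        for j in range(1, n):
            matrix[i][j] = (matrix[i][0] + matrix[0][j]) % 12
    return matrix


row = [0, 11, 3, 1, 7, 9, 5, 6, 2, 4, 10, 8]
m = set_complex(row)
print(m[1])  # [1, 0, 4, 2, 8, 10, 6, 7, 3, 5, 11, 9]
```

Each row of the matrix is a transposition of the prime row, and the main diagonal is all zeros because every pitch class sums with its own inversion to 0 modulo 12.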

This completes the precomposition phase. Now it's time to look at methods to traverse the rows
created with the preceding techniques to generate a composition.

9.12 Traversing and Manipulating Musical Materials

Having arranged the materials from which a composition is to be derived, we now consider methods
to traverse these materials in structured ways. Following are a few ways rows can be traversed
to structure tonal or atonal melodies, rhythms, dynamics, articulation, instrumentation, or anything
else that can be parameterized.

9.12.1 Deterministic Serial Methods

This section demonstrates some methods for iterating through tone rows. They are deterministic
because their outcomes do not rely on chance. They are serial because they iterate through lists.
Their use is not limited to tone rows but can be extended to arbitrary lists of data.

The basic idea is to supply a list of musical materials to a method that will select and return list
elements one at a time in a chosen order.

Table 9.4

Set Complex for Schoenberg's Opus 23 #5 (labels: Prime, Retrograde).

Cycle This method iterates a sequence either forward or backward. It can either select successive
elements or skip through the list. When it reaches the end of the list (either end), it starts over at
the other end. Its inputs are

■ The list to traverse

■ The previous position in the list

■ Whether to move forward (prime) or backward (retrograde)

Its output is the next element in sequence based on its previous position in the list. As a side
effect, it updates its position in the list.

If it traverses the list forward, it returns to the head of the list when it goes past the tail. If it
traverses the list in retrograde, it returns to the tail of the list when it goes past the head.

In the following code example, setting inc to 1 moves forward one element every time
cycle() is called, and setting inc to -1 moves backward one element at a time. Setting inc to
any other value skips through the list by that amount, wrapping around at the ends.

    Integer cycle(IntegerList L, Integer Reference pos, Integer inc) {
        Integer i = PosMod(pos, Length(L));   // compute current index
        pos = PosMod(pos + inc, Length(L));   // compute index for next time
        Return(L[i]);
    }


The pos argument keeps track of the position in the list. We wish to delegate to cycle() the
job of managing the list position, so we declare pos as a Reference argument. Thus, when
cycle() updates pos, the corresponding actual argument is changed. (If pos were not a
Reference variable, any changes cycle() made to its value would be lost when it returns; see
appendix B, B.1.22.)

Here's an example of invoking cycle():

    IntegerList L = {10, 11, 12};
    Integer myPos = 0;
    Integer n = 2 * Length(L) - 1; // go 1 less than two times through list
    For (Integer i = 0; i < n; i = i + 1)
        Print(cycle(L, myPos, 1)); // 1 = forward direction
    Print("myPos=", myPos);

This program prints 10, 11, 12, 10, 11. Last, it prints myPos=2, proving that cycle() is
changing the myPos parameter.
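For readers who want to experiment outside Musimat, the same traversal can be sketched in Python. Python has no reference parameters, so this hypothetical cycle function returns the updated position alongside the selected element, and the caller stores it.

```python
def cycle(items: list[int], pos: int, inc: int) -> tuple[int, int]:
    """Return (element at pos, position for next time), wrapping at both ends."""
    size = len(items)
    current = pos % size            # Python's % is never negative for size > 0
    next_pos = (pos + inc) % size
    return items[current], next_pos


row = [10, 11, 12]
pos = 0
out = []
for _ in range(2 * len(row) - 1):   # one less than two times through the list
    value, pos = cycle(row, pos, 1)
    out.append(value)

print(out, pos)  # [10, 11, 12, 10, 11] 2
```

Returning the new position makes the state change explicit at the call site, which plays the role that the Reference declaration plays in the Musimat version.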

Palindrome We can iterate a sequence in prime order until the last element in the sequence is
reached, then iterate the sequence retrograde until the first element in the sequence is reached, then
repeat.

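    Integer palindrome(IntegerList L, Integer Reference pos, Integer Reference inc) {
        Integer curPos = pos;            // remember the position before cycling
        Integer x = cycle(L, pos, inc);  // get list value and update pos
        If (curPos + inc != pos) {       // did cycle() wrap around an end of the list?
            inc = inc * (-1);            // change direction
            pos = curPos;                // stay at the end just reached
        }
        Return(x);
    }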


This method calls cycle() to do most of its work. Like cycle(), this method updates pos,
but it also must update its increment argument, inc, because whenever it hits the end of the list,
we want it to reverse the direction of traversal rather than start over. The extra work done by this
method is to change the increment and reset the position when either end of the list is reached. Here
is an example of invoking palindrome().

    IntegerList L = {10, 11, 12};
    Integer myPos = 0;
    Integer myInc = 1; // can be any positive or negative integer
    For (Integer i = 0; i < 2 * Length(L); i = i + 1)
        Print(palindrome(L, myPos, myInc));

This prints 10, 11, 12, 12, 11, 10. Note that the end of the list is printed twice. This makes it a
so-called even palindrome. It would be an odd palindrome if it were 10, 11, 12, 11, 10. It is left
as an exercise for the reader to adapt palindrome() to generate odd palindromes.
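The even palindrome can also be sketched in Python. As an assumption of this sketch, state is passed in and returned explicitly instead of through reference arguments, and wrap-around is detected by comparing the unwrapped position pos + inc with the wrapped one; the function names are hypothetical.

```python
def cycle(items: list[int], pos: int, inc: int) -> tuple[int, int]:
    """Return (element at pos, wrapped position for next time)."""
    size = len(items)
    return items[pos % size], (pos + inc) % size


def palindrome(items: list[int], pos: int, inc: int) -> tuple[int, int, int]:
    """Return (element, next position, next increment), reversing at either end."""
    value, next_pos = cycle(items, pos, inc)
    if pos + inc != next_pos:       # cycle() wrapped: an end of the list was reached
        return value, pos, -inc     # stay put and reverse direction
    return value, next_pos, inc


row = [10, 11, 12]
pos, inc = 0, 1
out = []
for _ in range(2 * len(row)):
    value, pos, inc = palindrome(row, pos, inc)
    out.append(value)

print(out)  # [10, 11, 12, 12, 11, 10]
```

The sketch expects the starting position to lie within the list, since the wrap test relies on pos already being a valid index.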

Permutation Iterate the supplied sequence in prime order until exhausted, then permute the
entire row by inc steps and repeat from the beginning.


    Integer permute(IntegerList L, Integer Reference pos,
                    Integer Reference count, Integer inc) {
        Integer curPos = pos;          // save current position
        Integer x = cycle(L, pos, 1);  // update pos and get list value
        count = count + 1;             // increment counter
        If (count == Length(L)) {      // have we output all items from list?
            count = 0;                 // reset count
            pos = curPos + inc;        // permute position for next time
        }
        Return(x);
    }


Here is an example of invoking permute().

    IntegerList L = {10, 11, 12};
    Integer inc = -1;
    Integer pos = 0;
    Integer perm = 0;
    For (Integer i = 0; i < 3 * Length(L); i = i + 1)
        Print(permute(L, pos, perm, inc));

This prints 10, 11, 12, 11, 12, 10, 12, 10, 11. Because inc = -1, it skips back one place in the
row every time. The trigger for it to skip is when it has output as many elements as are in the list.
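The permutation traversal follows the same pattern in Python. Again this is only a sketch under the assumption that position and count are returned to the caller rather than updated through references; the names are hypothetical.

```python
def cycle(items: list[int], pos: int, inc: int) -> tuple[int, int]:
    """Return (element at pos, wrapped position for next time)."""
    size = len(items)
    return items[pos % size], (pos + inc) % size


def permute(items: list[int], pos: int, count: int, inc: int) -> tuple[int, int, int]:
    """Return (element, next position, next count); shift by inc after each full pass."""
    value, next_pos = cycle(items, pos, 1)
    count += 1
    if count == len(items):     # a complete pass through the list has been output
        count = 0
        next_pos = pos + inc    # restart displaced by inc; cycle() wraps it next call
    return value, next_pos, count


row = [10, 11, 12]
pos, count = 0, 0
out = []
for _ in range(3 * len(row)):
    value, pos, count = permute(row, pos, count, -1)
    out.append(value)

print(out)  # [10, 11, 12, 11, 12, 10, 12, 10, 11]
```

Each group of three outputs is a rotation of the row, and successive groups start one place later in the original ordering.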

Transpose The dodecaphonic pitch classes are not tied to any octave. In order to realize music
from a tone row, its intervallic content must be translated to actual pitches of the musical scale.
One way to do this is to supply a pitch offset that transposes across pitch space (i.e., without limiting
it just to the range of pitch classes).

    Integer transpose(Integer p, Integer off) {
        Return(p + off);
    }


The C major diatonic scale in the fourth piano octave can then be given as follows:

    IntegerList Cmaj = {C, D, E, F, G, A, B}; // define C major scale
    For (Integer i = 0; i < Length(Cmaj); i = i + 1) {
        Cmaj[i] = transpose(Cmaj[i], 4 * 12); // shift all up 4 octaves
    }
    Print(Pitch(Cmaj));

This prints {Cn4, Dn4, En4, Fn4, Gn4, An4, Bn4}.

Interpolated Tendency Mask We can produce a new row that is a mixture of two other rows.
Let's have a variable that varies continuously between 0.0 and 1.0 such that when it is 0.0, the output
row is exactly the same as the first row; when it is 0.5, the output is exactly halfway between
the first and second; and when it is 1.0, the output is exactly the second row. For example, suppose
the first pitches in each row are 3 and 9, and the interpolation parameter is 0.5. Then the expected
result would be 6 because 6 lies halfway between the two values. If the interpolation parameter
were 0.0, we'd select 3, and if it were 1.0, we'd select 9.

Table 9.5 shows what happens if row A = {0, 2, 4, 6, 8, 10, 12} and row B = {12, 10, 8, 6, 4,
2, 0}, and f is set successively to 0.0, 0.25, 0.5, 0.75, and 1.0. When f = 0, we select the prime row,
when f = 1.0, we select the retrograde row, and in between, we select weighted mixtures.

Table 9.5

interpTendency Example

    Row A       0    2    4    6    8   10   12
    f = 0.00    0    2    4    6    8   10   12
    f = 0.25    3    4    5    6    7    8    9
    f = 0.50    6    6    6    6    6    6    6
    f = 0.75    9    8    7    6    5    4    3
    f = 1.00   12   10    8    6    4    2    0
    Row B      12   10    8    6    4    2    0

We use unit interpolation to find intermediate values that lie a certain distance between two
known points. If u is the upper bound and l is the lower bound and f is a control parameter in the
unit distance from 0.0 to 1.0, then

\[
y = f \cdot (u - l) + l \qquad \text{Unit Interpolation} \tag{9.9}
\]

sets y to a value close to u if f is near 1; it sets y to a value close to l if f is near 0; it sets y to a value
exactly halfway between u and l if f = 0.5. Here is the function for unit interpolation:

    Real unitInterp(Real f, Integer l, Integer u) {
        Return(f * (u - l) + l);
    }


This is a Real function because f must be a Real to take on fractional values. When we use it
as follows, we convert the Real result back to an integer by rounding:


    Integer interpTendency(
        Real f,                                  // factor ranging from 0.0 to 1.0
        IntegerList L1, Integer Reference pos1,  // list 1 and its position parameter
        IntegerList L2, Integer Reference pos2,  // list 2 and its position parameter
        Integer inc                              // amount by which to adjust position
    ) {
        Integer x = cycle(L1, pos1, inc);
        Integer y = cycle(L2, pos2, inc);
        Return(Integer(Round(unitInterp(f, x, y))));
    }


This function can perform a couple of neat tricks. First, we can have the function return exactly L1
or L2 by setting f = 0.0 or f = 1.0, respectively. By setting f = 0.5, we get the average of the
two rows. By gradually changing the value of f from 0.0 to 1.0, we mutate L1, transforming it gradually
until it becomes L2. Also, the lengths of L1 and L2 need not be the same. If L1 has a length of
5 and L2 a length of 6, it will take 5 · 6 iterations before the pattern repeats. Both lists use the same
increment, but redesigning this to use separate increments would provide for even more possibilities.
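The interpolation between two rows can be tried out with a short Python sketch. It is deliberately simplified: the hypothetical interp_rows pairs the rows element by element instead of cycling through them with position arguments, so it assumes rows of equal length. Python's round() rounds exact halves to the nearest even integer, which does not affect this example because every interpolated value is a whole number.

```python
def unit_interp(f: float, lower: float, upper: float) -> float:
    """y = f * (u - l) + l: returns lower at f = 0.0 and upper at f = 1.0."""
    return f * (upper - lower) + lower


def interp_rows(f: float, row_a: list[int], row_b: list[int]) -> list[int]:
    """Blend two equal-length rows element by element, rounding to integers."""
    return [round(unit_interp(f, a, b)) for a, b in zip(row_a, row_b)]


row_a = [0, 2, 4, 6, 8, 10, 12]
row_b = [12, 10, 8, 6, 4, 2, 0]

for f in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f, interp_rows(f, row_a, row_b))
# 0.0  [0, 2, 4, 6, 8, 10, 12]
# 0.25 [3, 4, 5, 6, 7, 8, 9]
# 0.5  [6, 6, 6, 6, 6, 6, 6]
# 0.75 [9, 8, 7, 6, 5, 4, 3]
# 1.0  [12, 10, 8, 6, 4, 2, 0]
```

At f = 0.5 every element lands on 6 because the two rows are mirror images of each other around that value.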

Linear Interpolation Linear interpolation allows us to map a range of values so that it covers
a proportionately wider or narrower range. Figure 9.14 shows linear interpolation from the range
1-4 on the left being mapped to the range 3-9 on the right. The value 3 on the left corresponds by
linear interpolation to 7 on the right. Linear interpolation maintains the linear proportions of the
two number lines: 3 is two-thirds of the way from 1 to 4, and 7 is two-thirds of the way from 3 to 9.

Linear interpolation is a slight generalization of unit interpolation, as follows. If $x_{max}$ is
the upper bound and $x_{min}$ is the lower bound, and x is a parameter in the range
$x_{min} \le x \le x_{max}$, then

\[
y = \frac{x - x_{min}}{x_{max} - x_{min}} \cdot (y_{max} - y_{min}) + y_{min} \qquad \text{Linear Interpolation} \tag{9.10}
\]

sets y to a position within the range $y_{min} \le y \le y_{max}$ that is proportional to the position of x within
its range. Here's the definition of linear interpolation in Musimat:

    Real linearInterpolate(
        Real x,     // value ranging from xMin to xMax
        Real xMin,  // minimum range of x
        Real xMax,  // maximum range of x
        Real yMin,  // target minimum range
        Real yMax   // target maximum range
    ) {
        Real a = (x - xMin) / (xMax - xMin);
        Real b = yMax - yMin;
        Return(a * b + yMin);
    }

Figure 9.14

Linear interpolation.


We also can use linear interpolation to map an entire function to a different range. We do so by
applying linear interpolation to every point on the function. For example, we can scale a chromatic
melody to occupy a wider or narrower tessitura as follows:

    IntegerList stretch(IntegerList L, Integer yMin, Integer yMax) {
        Integer xMin = Min(L); // find the list's minimum
        Integer xMax = Max(L); // find the list's maximum
        For (Integer i = 0; i < Length(L); i = i + 1) {
            L[i] = linearInterpolate(L[i], xMin, xMax, yMin, yMax);
        }
        Return(L);
    }


For example, invoking stretch() with these arguments

    IntegerList x = stretch(L, 24, 47);

will scale the row to cover a two-octave range and offset it upward by two octaves. If the input is

    IntegerList L = {0, 8, 10, 6, 7, 5, 9, 1, 3, 2, 11, 4};

then x will be {24, 40, 44, 36, 38, 34, 42, 26, 30, 28, 47, 32}. It can also be used to compress
rows. With the same input, stretch(L, 0, 5) will produce {0, 3, 4, 2, 3, 2, 4, 0,
1, 0, 5, 1}.
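The range mapping can be reproduced in Python. In this sketch (function names hypothetical), the real-valued result is converted to an integer by truncation with int(), which is an assumption chosen because it reproduces the values listed above. The sketch also assumes the row contains at least two distinct values, since otherwise the input range has zero width.

```python
def linear_interpolate(x: float, x_min: float, x_max: float,
                       y_min: float, y_max: float) -> float:
    """Map x from the range [x_min, x_max] proportionally onto [y_min, y_max]."""
    a = (x - x_min) / (x_max - x_min)
    return a * (y_max - y_min) + y_min


def stretch(row: list[int], y_min: int, y_max: int) -> list[int]:
    """Rescale a row to span y_min..y_max, truncating each result to an integer."""
    x_min, x_max = min(row), max(row)
    return [int(linear_interpolate(x, x_min, x_max, y_min, y_max)) for x in row]


row = [0, 8, 10, 6, 7, 5, 9, 1, 3, 2, 11, 4]
print(stretch(row, 24, 47))  # [24, 40, 44, 36, 38, 34, 42, 26, 30, 28, 47, 32]
print(stretch(row, 0, 5))    # [0, 3, 4, 2, 3, 2, 4, 0, 1, 0, 5, 1]
```

Compressing a 12-tone row into six values necessarily maps several pitch classes to the same number, so the result is no longer a row in which every value is distinct.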

9.12.2 Deterministic Rhythmic Techniques of Joseph Schillinger

Joseph Schillinger, a refugee from Soviet Russia, became a prominent music theorist in New York
in the 1930s and counted among his students the famous jazz musicians George Gershwin and
Benny Goodman. In his book The Mathematical Basis of the Arts (1948) he was highly critical of
art theory, writing, "It is time to admit that esthetic theories have failed in the analysis as well as
the synthesis of art. These have been unsuccessful both in interpreting the nature of art and in evolving
a reliable method of composition."

He was looking to establish a scientific theory of art and to put practical methods into the hands
of artists, giving them a mathematician's vision of the nature and extent of their domain. This, he
hoped, would help free musicians from the deadening weight of musical tradition, much as
Schoenberg hoped atonal composing techniques would.

Schillinger envisioned development of "instruments for the automatic composition of music,"
including rhythm, melody, harmony, harmonization, counterpoint, and timbre. His name for such
instruments was Musamaton. He collaborated with Leon Theremin to create a device he dubbed
the Rhythmicon, which he used for "the composition and automatic performance of rhythmic
patterns." He looked toward the use of such devices by anyone, not requiring special training,
"suitable for schools, clubs, public amusement places, and homes."

He wrote a large, deeply flawed, two-volume tome, The Schillinger System of Musical Composition.
Some of his ideas seem banal, others are incomprehensible, and he expressed his musical formalisms
using a pseudomathematical notation of his own design, accompanied by often cryptic explanations
that usually served to mystify the reader. He criticized the work of famous composers such as
Beethoven, rewrote compositions of J. S. Bach to "improve" them, and in general displayed an
arrogance that undercut his message (Backus 1961). Nonetheless, for the intrepid, there are interesting
ideas in his work, particularly regarding rhythm, an otherwise quite neglected subject in music theory.

He began with the observation that music is a time-based art where continuous time is broken
into pulses. Schillinger's idea is that rhythm arises through the "interference" of two sources of
pulse. For example, consider two harmonically related pulse generators (figure 9.15). The major
generator produces three pulses in the same time as the minor generator produces two. Schillinger
called the resultant pattern pulse interference, although the name is confusing because
the result is actually the product of the two functions, whereas interference implies addition
(see section 7.7).

All the pulse interference patterns that can be produced by the ratio of any two integers form an
inversely symmetrical pattern around their midpoint. Transitions in the pulse interference functions
represent rhythmic stress points in the resulting rhythmic pattern. Their interpretation
depends upon the musical context. For example, the interference pattern shown in figure 9.15 can
be interpreted trivially in any of the three ways shown in figure 9.16.
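One way to experiment with such patterns is sketched below in Python (not Musimat). It rests on a simplifying assumption that is not taken from the figures: each generator is modeled only by its attack points, spaced evenly over the common period, and the resulting rhythm is read off as the gaps between the merged attack points. The function name resultant is hypothetical. Under that reading, the 3-against-2 case yields a pattern that is symmetrical about its midpoint, as the text describes.

```python
from math import gcd


def resultant(a, b):
    """Durations between the merged attack points of two pulse generators.

    Assumption: generator a fires a evenly spaced pulses and generator b
    fires b evenly spaced pulses over one common period, both starting at 0.
    """
    period = a * b // gcd(a, b)                   # least common multiple
    attacks_a = set(range(0, period, period // a))
    attacks_b = set(range(0, period, period // b))
    attacks = sorted(attacks_a | attacks_b)
    attacks.append(period)                        # close the cycle
    return [t1 - t0 for t0, t1 in zip(attacks, attacks[1:])]


pattern = resultant(3, 2)
print(pattern)                    # durations in units of 1/6 of the period
print(pattern == pattern[::-1])   # symmetrical about the midpoint?
```

The durations are expressed in the smallest common time unit, so a composer would still have to choose what note value that unit represents.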


Figure 9.15
Pulse interference.

Figure 9.16
Schillinger's pulse interference patterns.

Figure 9.17
Generating a melody with Schillinger's interference patterns. (Figure annotations: The melody rises
and falls as the function in the grid rises and falls. The rhythm starts a new note on each transition.)

Such patterns can be applied to many musical contexts. For example, we can create a melody
using the pulse interference pattern shown in figure 9.15 as a trigger function to select a pitch from
another function representing pitch displacement (figure 9.17). The function shown in the grid is
an arbitrary shape that determines the melody; the function labeled Rhythm is an independently
generated interference pattern that determines the rhythm. Applying the interference pattern to the
melody shape produces the sequence of notes shown to the right in figure 9.17.

The pulse interference pattern is projected across the x-axis, and the diatonic scale is projected
across the y-axis. Notes are placed where the transitions in the rhythmic pattern intersect the pitch
displacement function. The interference pattern determines the note's duration. The composer
John Myhill adapted this technique in his 1965 composition Scherzo a Tre Voce for
computer-synthesized tape alone (Ames 1967).

9.12.3 Representing Music with Functions

The basis of Schillinger's compositional idea is to map an arbitrary curve to musical notation by
quantization (see volume 2, chapter 1). Of course, the process works in reverse as well: the grid
in figure 9.17 can be used to generate the corresponding pitch curve and rhythmic function of any
piece of notated music. Mathews and Rossler (1968) developed a graphical language for
representing scores of computer-generated sounds that uses this approach. They represented music with
continuous time functions that were quantized to obtain pitch and discretized to obtain time.

Mathews produced an interesting demonstration of the flexibility of this approach for composing.
He began by generating pitch and rhythm functions for two traditional tunes, the English military
anthem The British Grenadiers and the American tune When Johnny Comes Marching Home.
Then he created a new melody by performing linear interpolation on the two sets of functions.
When the interpolation parameter was set at 0.0, the method produced The British Grenadiers, and
set at 1.0, it produced When Johnny Comes Marching Home. In between, one heard something that
sounded like a mutated combination of both. In his example, he varied the interpolation parameter
gradually from 0.0 to 1.0, with the result that the synthesized melody first resembled Grenadiers,
but Johnny gradually emerged from the chaos in the middle and took over. Though it is graceless
as a musical étude, Mathews's effort nonetheless is a startling demonstration of how malleable
music can be under these kinds of transformations.
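The idea of interpolating between two melodies can be illustrated with a minimal Python sketch. It is not Mathews's procedure: it assumes each tune has already been reduced to a list of pitch numbers of equal length, it ignores the rhythm functions entirely, and the two short lists below are placeholders rather than the actual tunes. The name interpolate_melodies is hypothetical.

```python
def interpolate_melodies(a, b, t):
    """Blend two equal-length pitch lists; t = 0.0 gives a, t = 1.0 gives b."""
    if len(a) != len(b):
        raise ValueError("both pitch lists must have the same length")
    # Pointwise linear interpolation, then quantize to the nearest pitch number.
    return [round((1.0 - t) * p + t * q) for p, q in zip(a, b)]


# Placeholder pitch lists (MIDI-style numbers), not the real melodies.
tune_a = [60, 62, 64, 65, 67]
tune_b = [67, 65, 64, 62, 60]

for t in (0.0, 0.25, 0.75, 1.0):
    print(t, interpolate_melodies(tune_a, tune_b, t))
```

Sweeping t gradually from 0.0 to 1.0 while generating successive notes would give the kind of gradual mutation the text describes.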

9.12.4 Nondeterministic Serial Methods

Deterministic methods produce the same result every time they are presented with the same inputs.
The methods discussed in this section rely on randomness, so they are nondeterministic methods.

Sampling without Replacement We can generate a randomly selected 12-tone row, for example,
by putting 12 balls in an urn, each marked with one of the chromatic pitch classes, and drawing them out
one at a time without replacement, thereby guaranteeing that no pitch class is chosen more than once.

Random(0, 11) returns a random integer between 0 and 11 with equal probability. But it could
return the same value multiple times, so we must keep track of which pitch classes have been chosen
to ensure that it eventually picks one of each. This function takes one argument, N, determining
the length of the row.

IntegerList randomRow(Integer N) {
    IntegerList L;  // keep track of pitches chosen so far
    IntegerList M;  // used to build up random row
    Integer i;

    // set all list elements to zero, which means "unused"
    For (i = 0; i < N; i = i + 1) {L[i] = 0;}

    // build up M, marking off elements in L when they are chosen
    i = 0;
    While (i < N) {
        Integer x = Random(0, N - 1);  // returns integer random value
        If (L[x] == 0) {               // hasn't been chosen yet?
            L[x] = 1;                  // mark it "used"
            M[i] = x;                  // save result
            i = i + 1;                 // increment control variable
        }
    }
    Return(M);
}



Note that the second loop keeps repeating over and over until Random() has finally selected
all N pitch classes. It then returns the newly created 12-tone row in M. Here is an example row
created by randomRow():

{0, 6, 2, 9, 7, 5, 4, 10, 8, 3, 1, 11};

Every pitch class is represented exactly once.
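The same rejection scheme can be sketched in Python (not Musimat) for readers who want to run it. The name random_row follows the listing above; the optional rng parameter is an assumption added so that a seeded generator can be supplied for repeatable results.

```python
import random


def random_row(n, rng=random):
    """Draw the integers 0..n-1 in random order, rejecting repeats."""
    used = [False] * n          # which pitch classes have been chosen so far
    row = []
    while len(row) < n:
        x = rng.randint(0, n - 1)   # inclusive on both ends
        if not used[x]:             # hasn't been chosen yet?
            used[x] = True
            row.append(x)
    return row


print(random_row(12))
```

Because late draws are mostly rejected, the loop makes more calls to the random number generator than there are elements; for short rows such as 12 tones this costs little.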

Shuffle We can create a random permutation of a row rather as one would shuffle a deck of cards.
If we distinguish between the cards and their position in the deck, shuffling consists of swapping
the positions of all cards a pair at a time. First, we need a way to swap the position of two cards
in the deck. We can swap the position of two elements in an IntegerList like this:


IntegerList swap(IntegerList L, Integer from, Integer to) {
    Integer x = L[to];  // save target value
    L[to] = L[from];    // swap from -> to
    L[from] = x;        // swap to -> from
    Return(L);
}


To shuffle an entire deck of cards (or row of pitch classes), we visit each position in the list from
first to last in order and swap the card at each position with a card at a randomly chosen other
position. Because we use Random() to choose the position of the other card to swap, the "other"
position can be any position in the deck, including the currently selected position; thus we may
occasionally swap a card with its own position, leaving it where it was. However, in a subsequent
step, that card might be chosen to be swapped elsewhere.

IntegerList shuffle(IntegerList L) {
    IntegerList M = randomRow(Length(L)); // elements to swap
    For (Integer i = 0; i < Length(L); i = i + 1) {
        Integer j = M[i];
        L = swap(L, i, j);
    }
    Return(L);
}


The first step is to generate a new row with randomRow(), which is stored in IntegerList M.
Successive values of i and successive elements of M give the indexes of the elements in L that are
to be swapped. Suppose we have

L = {0, 6, 2, 9, 7, 5, 4, 10, 8, 3, 1, 11}; // source row
M = {5, 1, 0, 4, 6, 7, 9, 3, 10, 8, 11, 2}; // row created in shuffle

Then each row in table 9.6 shows the intermediate values of L as its elements are being swapped. The
pattern starts out like this: swap the value in position 0 and the value in position 5; swap the value
in position 1 with itself; swap the value in position 2 and the value in position 0; swap the value in
position 3 and the value in position 4; and so on. The result is that every element of the input row is
swapped randomly with another element, but there's a chance it might be swapped with itself.

Table 9.6
An Example of Shuffling a Set
(Columns are positions 0 through 11.)
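The sequence of swaps can be traced step by step with the following Python sketch (not Musimat). The function name shuffle_with is hypothetical, and passing the swap row M in explicitly, instead of generating it at random, is an assumption made so that the trace is repeatable with the L and M given in the text.

```python
def shuffle_with(row, swaps):
    """Apply swap(i, swaps[i]) for each position i; return every intermediate row."""
    row = list(row)                      # work on a copy
    steps = []
    for i, j in enumerate(swaps):
        row[i], row[j] = row[j], row[i]  # swap positions i and j
        steps.append(list(row))
    return steps


L = [0, 6, 2, 9, 7, 5, 4, 10, 8, 3, 1, 11]   # source row
M = [5, 1, 0, 4, 6, 7, 9, 3, 10, 8, 11, 2]   # swap targets

for i, state in enumerate(shuffle_with(L, M)):
    print(i, M[i], state)
```

Each printed line shows the position i, its swap partner M[i], and the row after that swap, which is the kind of information a table of intermediate values would hold.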

Random Tendency Mask We can use a row to specify an upper boundary and another row to
specify a lower boundary, and then pick a pitch in this range. We can pick any pitch in the range, either
the median pitch or a random pitch or even all pitches, depending upon what we want to use it for. This
example returns a random value lying between two rows that act as fences to limit the random range.

Integer randTendency(IntegerList L1, Integer Reference pos1,
                     IntegerList L2, Integer Reference pos2, Integer inc) {
    Integer x = cycle(L1, pos1, inc);
    Integer y = cycle(L2, pos2, inc);
    If (x < y)
        Return(Random(x, y));
    Else
        Return(Random(y, x));
}



For example, if L1 and L2 are as shown in the following table, the values in the middle row are
random values chosen from between them.

Row L1       0   8  10   6   7   5   9   1   3   2  11   4
Output row   0   1   8   6   7   5   5   3   5   4   9   7
Row L2       4   0   2  10  11   9   1   5   7   6   3   8
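A tendency mask of this kind can be sketched in Python (not Musimat) as follows. The sketch simplifies the listing above: instead of stepping through each row with cycle() and a position reference, it walks both rows in parallel once, and it assumes that Random(x, y) includes both end points. The name rand_tendency and the rng parameter are illustrative assumptions.

```python
import random


def rand_tendency(row1, row2, rng=random):
    """For each pair of bounds, pick a random integer lying between them."""
    out = []
    for x, y in zip(row1, row2):
        lo, hi = min(x, y), max(x, y)    # either row may hold the lower bound
        out.append(rng.randint(lo, hi))  # inclusive on both ends
    return out


L1 = [0, 8, 10, 6, 7, 5, 9, 1, 3, 2, 11, 4]
L2 = [4, 0, 2, 10, 11, 9, 1, 5, 7, 6, 3, 8]
print(rand_tendency(L1, L2))
```

Every run gives a different output row, but each value always stays inside the fence formed by the two rows at that position.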


9.12.5 Serialism 

Schoenberg and his school were amplifying musical trends of their time to deconstruct tonal
expectation and key-centeredness in European art music. Rows and their treatment were chosen to defeat
the tendency to hear tonal centeredness of any kind. Functional harmony was banished; even the
too frequent repetition of a pitch was taboo lest it lend a tonal center to the music.

But it would be a disservice to Schoenberg and his school to imply that their music followed a
deconstructionist agenda to the exclusion of all else. They offered the intervallic structure of the row and the
organization of set forms as the new ligatures holding their music together. Perle and Lansky (1981)
write,

Perhaps the most important influence of Schoenberg's method is not the 12-note idea in itself, but along with
it the individual concepts of permutation, inversional symmetry and complementation, invariance under
transformation, aggregate construction, closed systems, properties of adjacency as compositional determinants,
transformations of musical surfaces through predefined operations, and so on.

But deconstructionism, once set into motion, rarely stops until it has devoured everything. Some
composers of the post-World War II era observed vestiges of other traditional techniques in the
music of Schoenberg and Berg. They noted that Schoenberg and Berg treated the 12-tone row as
a theme to be developed, a practice that harked back to the classical technique of theme and
variations: thematicism. They idolized the work of Schoenberg's pupil Anton Webern because he
eschewed thematic development, building up compact, jewel-like compositions from as few
as three notes. For example, in his Concerto for Nine Instruments written in 1934, all pitches
are derived from the simple motive B-B♭-D (prime form) and its retrograde, inversion, and
retrograde-inversion. His systematic treatment of pitch, rhythm, dynamics, and articulation was
taken by these younger composers as a model for a new form of music.

The composer Olivier Messiaen in France extended the 12-tone pitch-ordering technique of
Schoenberg's school to all other parameters of music, although he was working with modal pitch
structures, not 12-tone rows (Messiaen 1942; Drew 1954/1955). Inspired by Webern and Messiaen,
other composers, including Pierre Boulez in France and Milton Babbitt in the United States, adapted
Messiaen's ideas back to atonal practices, and totally organized music, or serialism, was born.
According to Stuckenschmidt (1969), "Serial techniques are essentially a systematic transference
of Schoenberg's 12-tone technique to elements of musical sound other than pitch."

This idea interlocked with two others. Just as the tones in a 12-tone row were decoupled in
significance from each other, the serialist composers decoupled all parameters of the musical note
from each other. Pitch, register, tone color, and dynamic level became independent. Just as all
tones were used in a 12-tone row, the serialist composers employed the entire available range
of every other musical parameter (high to low, loud to soft, fast to slow, bright to dull) without
preference.

The dodecaphonists observed the tonal equivalence of the equal-tempered scale and sought to
construct a new musical aesthetic that reflected this equality. To do so, they developed a 12-tone
method that deconstructed tonal expectation and key-centeredness. The notion of tonal
equivalence was extended by the serialists to project a uniform proportionality between all musical
parameters and all combinations of musical parameters.

Stuckenschmidt (1969), who witnessed the premieres of the European serialist composers in the
1950s, wrote, "The impression made by all these works, even on a listener who had read the
commentaries beforehand, was one of chaos" (214).

The composer György Ligeti (1965) wrote, "Now that hierarchical connections have been
destroyed, regular metrical pulsations dispensed with, and durations, degrees of loudness, and
timbres have been turned over to the tender mercies of serial distribution, it becomes increasingly
difficult to achieve contrast" (16). These compositions often projected a static quality, a musical
equivalent of alphabet soup (see section 9.15 for why these effects occur). Ligeti (1965) summed
it up: "Serial music is doomed to the same fate as all previous sorts of music; at birth it already
harbored the seeds of its own dissolution" (14).

9.13 Stochastic Techniques

With every musical parameter now serially ordered, there was even less familiar structure for
listeners to rely upon to orient themselves in the music. The composer Iannis Xenakis (1955)
criticized serialism as follows:

Linear polyphony destroys itself by its very complexity; what one hears is in reality nothing but a mass of notes
in various registers. The enormous complexity prevents the audience from following the intertwining of the
lines and has as its macroscopic effect an irrational and fortuitous dispersion of sounds over the whole extent
of the sonic spectrum. There is consequently a contradiction between the polyphonic linear system and the
heard result, which is surface or mass.

Echoing the same sentiment, the composer Gottfried M. Koenig (1970) wrote, "The trouble taken
by the composer with series and their permutations has been in vain; in the end it is the statistical
distribution that determines the composition."

Believing that the listener experiences only the statistical aspects of serial music, these
composers reasoned that a better approach would be to compose directly using probabilistic instead of
serial techniques. Xenakis (1955) writes,

This contradiction inherent in [serial] polyphony will disappear [and] what will count will be the statistical
mean of isolated states and of transformations of sonic components at a given moment. The macroscopic
effect can then be controlled by the mean of the movements of elements which we select. The result is the
introduction of the notion of probability, which implies, in this particular case, combinatory calculus. Here,
in a few words, is the possible escape route from the "linear category" in musical thought.

Xenakis (1971) was reacting against serialism and also aligning himself with a worldview then
developing in the physics of quantum mechanics: "It is a matter here of a philosophic and aesthetic
concept ruled by the laws of probability and by the mathematical functions that formulate that
theory, of a coherent concept in a new region of coherence." Xenakis's attempt to align music
aesthetics with a natural theory is not a new enterprise, of course, but dates back at least to the early
Renaissance music theorist Gioseffo Zarlino, who championed the view (as did others) that music
imitates nature (see section 9.17.5).

While some of Xenakis's examples in his book Formalized Music describe methods for
organizing music for traditional instruments, elsewhere in this work he presents a more abstract kind
of sound organization. He asserts, "All sound is an integration of grains, of elementary sonic
particles, of sonic quanta." Xenakis was influenced by the seminal work of Dennis Gabor, who in 1947
observed an isomorphism between the Fourier series and a quantum analysis of sound (see volume 2,
chapters 9 and 10).

Given the burden of computation required by a statistical approach to composition, it is not
surprising that composers like Koenig and Xenakis turned to computers to help compose musical
works. Xenakis (1971) enthused, "With the aid of electronic computers the composer becomes a
sort of pilot: he presses the buttons, introduces coordinates, and supervises the controls of a cosmic
vessel sailing in the space of sound, across sonic constellations and galaxies that he could formerly
glimpse only as a distant dream." These composers believed that statistical composing systems
using computers would allow them to shift their attention from the surface of the music to its inner
structure.

9.14 Probability 

Suppose a player with eyes closed strikes a piano key at random. What is the chance that the struck
key will be middle C? A standard piano has 88 keys, so to a first approximation, we'd expect the
probability to be 1 out of 88. But because the white keys are larger than the black keys, all outcomes
are not equally likely. To study this more closely, let's define some terms.

■ Sample space The set of possible outcomes.

■ Event The outcome of a random process, such as a roll of the dice.

■ Probability The relative likelihood of an event, usually expressed as a real number in the range
0 ≤ p ≤ 1.

■ Probability distribution A function, graph, or listing of the probabilities of the sample space
that shows how probability is distributed among the possible events.

■ Uniform distribution If all events in a sample space are equally likely, the resulting distribution
is said to be uniform.



■ Discrete distribution A distribution is discrete if the events in the sample space can be
individually distinguished. Tossing coins or dice or picking a note on a keyboard are examples of
discrete distributions.

■ Continuous distribution A distribution is continuous if the events in the sample space cannot
be individually distinguished. Temperature and frequency are examples of continuous distributions.

■ Random variable Let s be the sample space consisting of both sides of a coin, which can be
represented as the set {Heads, Tails}. When a coin is flipped, outcome R must be one of Heads or Tails.

In order to construct the probability distribution, we set a random variable x in turn to each possible
outcome of the sample space s and determine the probability that x is equal to outcome R, as follows:

$$f(x) = P(x = R) = \begin{cases} .5, & x = \text{Heads} \\ .5, & x = \text{Tails} \end{cases}$$

which is read as "The probability distribution function f of random variable x is defined as the
probability that x equals outcome R, which is .5 if x is heads and .5 if x is tails." The random variable
indexes the probability distribution function in order to determine the value of the function at that
index.

We can use these terms to classify chance operations for further study. For example, tossing a
coin has a sample space consisting of two outcomes, Heads or Tails, and the probability is 1/2 for
either Heads or Tails if the coin is true, so its discrete probability distribution is uniform. Tossing
a single die has six possible outcomes; if the die is true, each outcome has a probability of 1/6, so
its discrete distribution is also uniform.

9.14.1 Discrete Distribution 

The sample space of one die has d = 6 outcomes. Suppose we roll a white die d_w and a black die
d_b. If we distinguish the event {d_w = 1, d_b = 2} from the event {d_w = 2, d_b = 1} and tally up the
combination of all possible outcomes, we find that the size of the sample space is the product:
d_w · d_b = (6 · 6) = 36. The states are enumerated in table 9.7. Each number in the table grid is the
sum of the two dice. Note that only one dice combination sums to 2, one sums to 12, and six
combinations yield 7. We'd rightly expect that the more combinations sum to the same value, the more
probable those outcomes will be. So we'd expect a roll of two dice to be most likely to sum to 7
and least likely to sum to either 2 or 12. The corresponding probability distribution for the sum of
the dice is shown in figure 9.18.

Table 9.7
Sample Space: Sum of Two Dice

                      Black Die
White Die     1    2    3    4    5    6
    1         2    3    4    5    6    7
    2         3    4    5    6    7    8
    3         4    5    6    7    8    9
    4         5    6    7    8    9   10
    5         6    7    8    9   10   11
    6         7    8    9   10   11   12

Figure 9.18
Probability for the sum of dice.

Interestingly, if we roll two dice and tally them separately, the probability distribution of all
faces is uniform. But if we sum two dice, some combinations are more likely because some
combinations are more numerous than others, as shown in figure 9.18.

A fundamental insight of probability theory is that if a random variable x has distribution f(x)
and a random variable y has distribution f(y), then the distribution of the sum of the two random
variables f(x + y) is the convolution of f(x) and f(y) (F. R. Moore, 1990). (To understand the
mathematical reason for this, see volume 2, chapter 4.) Figure 9.18 shows the convolution of two
uniform distributions.
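The convolution of two die distributions can be computed directly, as in this Python sketch (not Musimat). Representing a distribution as a dictionary from outcome to probability, and using exact fractions, are choices made here for illustration; the function name convolve is likewise an assumption.

```python
from fractions import Fraction


def convolve(f, g):
    """Distribution of x + y for independent x ~ f and y ~ g."""
    out = {}
    for x, px in f.items():
        for y, py in g.items():
            # Every pair (x, y) contributes its joint probability to the sum x + y.
            out[x + y] = out.get(x + y, Fraction(0)) + px * py
    return out


die = {face: Fraction(1, 6) for face in range(1, 7)}   # uniform, true die
two_dice = convolve(die, die)

for total in sorted(two_dice):
    print(total, two_dice[total])
```

The printed probabilities rise from 1/36 at a sum of 2 to 6/36 (shown reduced, as 1/6) at a sum of 7 and fall back to 1/36 at 12, the triangular shape described for the sum of two dice.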

9.14.2 Continuous Distribution 

Suppose a violinist with eyes closed stops the G string (which is pitched a fourth below middle C)
somewhere along its length. What is the chance that the violinist stops the string at exactly
middle C, 261.626 Hz? Because the string is continuous, there are in fact an infinite number of
frequency gradations along its length, just as there are an infinite number of points along its length.⁷
So the likelihood that the violinist will stop the string at any particular pitch is infinitesimal. How
do we study continuous distributions if every event is infinitely improbable? We finesse
this problem by assigning probabilities to subsets of the sample space, effectively breaking the
continuous space into discrete regions. We ask questions like, What is the probability that the
violinist stops the string within a half step of middle C? A positive probability can be assigned
to such an event.

This example shows that probability only operates on discrete sample spaces, and if we must
operate on a continuous variable such as frequency or temperature, we must first break the
continuum into a discrete sample space. If we take this region size to the infinitesimal limit, we are
in effect operating on a discrete sample space of infinitesimal dimensions. But then we are back
to the situation where the probability of each infinitesimal outcome is infinitely small.





9.14.3 Uniform Distribution 

Let's return to the example of striking piano keys at random. Assume (incorrectly) that the
outcomes are all equally likely and that the probability of actually striking a key is 1. Then the
probability of striking a particular key (such as middle C) is the probability of striking any one key
divided by the number of keys, or 1/88.

If the events in a sample space are all equally likely, we can define the uniform probability
distribution function f(x) as

$$f(x) = P(x = R) = \frac{1}{s}, \tag{9.11}$$

where R is a particular outcome (e.g., the struck key is middle C), s is the number of events in the
sample space, x is the random variable, and P(x = R) is the probability that x is R.

The number of keys s on an organ keyboard is 60, so striking middle C in a random attempt is
somewhat more likely on this instrument. Along the same lines, the chance of striking any key in
the middle octave of the piano is 12/88. The probability of striking any pitch class C is 8/88 because
there are eight C keys on the standard piano.
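These counts can be verified with a few lines of Python (not Musimat). The sketch assumes, for illustration only, that the 88 piano keys are labeled with MIDI note numbers 21 (the lowest A) through 108 (the highest C), with middle C at 60; that numbering is not part of the text. The helper name uniform_probability is hypothetical.

```python
from fractions import Fraction

# Assumption: label the 88 keys with MIDI note numbers 21..108 (middle C = 60).
piano_keys = list(range(21, 109))


def uniform_probability(event_keys, sample_space):
    """Probability of an event when every outcome is equally likely."""
    return Fraction(len(event_keys), len(sample_space))


middle_c = [k for k in piano_keys if k == 60]
middle_octave = [k for k in piano_keys if 60 <= k < 72]
all_cs = [k for k in piano_keys if k % 12 == 0]   # pitch class C

print(uniform_probability(middle_c, piano_keys))       # one key out of 88
print(uniform_probability(middle_octave, piano_keys))  # twelve keys out of 88
print(uniform_probability(all_cs, piano_keys))         # eight keys out of 88
```

Fraction reduces its results, so 12/88 and 8/88 are displayed in lowest terms.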

9.14.4 Nonuniform Distributions 

It's time to face up to the fact that more area on a piano keyboard is covered by white keys than
black, so the likelihood of striking a black key at random is less than striking a white one. The ratio
of the area occupied by all the white keys k_w to the total keyboard area k_a expresses the probability
of striking a white key:

$$P(w) = \frac{k_w}{k_a}, \tag{9.12}$$

where P(w) is the probability of striking a white key. There are n = 2 kinds of keys. If P(w) ≠ 1/n,
the probability distribution is not uniform. If P(w) > 1/n, striking a white key is more probable.

By inspecting a piano keyboard, we can estimate that the ratio of white key area to total key area is
P(w) = 3/4. By this analysis, the odds are that a white key would be randomly selected about 75 percent
of the time and a black key the remainder of the time (figure 9.19). This plot is a probability distribution
function because it expresses how probability is distributed over the sample space.

Let s be the sample space of all white and black piano keys, which can be represented as the set
{White, Black}. The outcome R must be one of White or Black. We construct the probability
distribution function by setting a random variable x in turn to each possible outcome of the sample
space s and determining the probability that x is equal to outcome R. Unlike in the coin example,
this distribution is not uniform:

$$f(x) = P(x = R) = \begin{cases} .75, & x = \text{White} \\ .25, & x = \text{Black} \end{cases}$$



Figure 9.19
Piano key probability distribution.


In general, if the sample space s = {x_0, x_1, ..., x_n}, the probability that some event R is equal to
a particular x in s is the function

$$f(x) = P(x = R) \tag{9.13}$$

for any x. This is read as, "The probability that a random event R will result in an outcome x is
defined by the function f." For example, f(Black) = P(R = Black) = 25 percent, and
f(White) = P(R = White) = 75 percent.

9.14.5 Generating Outcomes from Probability Distributions

Probability distributions allow us to analyze random systems like dice and coins, but we can also
use them to synthesize random numbers that are distributed in probability according to our
choosing. We can use such systems to drive compositional processes to automatically generate music
according to rules that we supply.

Say, for instance, we wish to use a random system to create a melody so that it favors lower
pitches in the scale. Let's limit the sample space to one octave of the chromatic scale. We can
represent this as a probability inequality:

$$f(x) = P(R = C) > P(R = C\sharp) > P(R = D) > \cdots > P(R = B).$$

To be specific, suppose we want to create a probability distribution function that is 12 times more
likely to pick C than B, 11 times more likely to pick C♯ than B, 10 times more likely to pick D than
B, and so on. The probability distribution function would look like the one in figure 9.20.

We know what we want, but how do we get it? So far, the only things we have to work with are
a random number generator, Random() (see appendix B, B.1.27) and a probability distribution
function (figure 9.21).

9.14.6 Cumulative Distribution Function

Let's rotate each of the weights in figure 9.21 and then concatenate them. Their sum is 78, so we divide
the length of each weight by 78 so that the weights sum to a length of 1.0 (figure 9.21). We have
effectively divided up the x-axis in the unit interval into 12 areas that are proportional to the weights in the
original distribution. Now we pick a random number in the unit interval with the Random() function,
see which interval the number would fall in, and then determine the chosen pitch. The probability that
a particular interval will be chosen is proportional to the extent of its footprint on the x-axis.

Figure 9.20
Chromatic probability distribution.

Figure 9.21
Chromatic probability distribution. (Segments of length 12/78, 11/78, ..., 1/78 for the pitches
C, C♯, D, ..., B, running from more likely on the left to less likely on the right.)

How can we represent this formally so that a computer can do this? First, the statement

RealList f = {12.0, 11.0, 10.0, 9.0, 8.0, 7.0, 6.0, 5.0, 4.0, 3.0, 2.0, 1.0};

defines the weights for each pitch, lowest to highest, left to right. Note that the type of list f is
RealList.

Next, we normalize the weights so that they sum to 1.0 (see appendix A, A.3):

$$\sum_{n} f(n) = 1.0.$$

Normalizing is done in two steps:

1. Find the sum of all weights:

Real sum(RealList L){
    Real s = 0.0;
    For (Integer i = 0; i < Length(L); i = i + 1){
        s = s + L[i];
    }
    Return(s);
}

Given the definition of RealList f above, Print(sum(f)) prints 78.

2. Divide each weight by sum(f) so that the sum of the weights equals 1.0:

RealList normalize(RealList L, Real s){
    For (Integer i = 0; i < Length(L); i = i + 1){
        L[i] = L[i] / s;
    }
    Return(L);
}

Given the definition of RealList f above, the statements:

RealList r = normalize(f, sum(f));
Print(realToRational(r)); // realToRational is a built-in function

print {12/78, 11/78, 10/78, 9/78, 8/78, 7/78, 6/78, 5/78, 4/78, 3/78, 2/78, 1/78}.

After these two steps, r will look like figure 9.20 except that all values are scaled down by 78. (The
built-in realToRational() function is described in appendix B, B.2.2.)

Next, we create a function such that each step along thex-axis accumulates all the weights to 
its left with its own weight (figure 9.22). Thefirst column has a height of 12/78, the second of 
12/78 + 11/78, the next of 12/78 +11/78 + 10/78, and so on. This function is called a cumulative 
distribution function. 



Figure 9.22 

Cumulative distribution function. 






If we index the y-axis of figure 9.22 with a random value in the unit interval, the corresponding x-axis value will be one of the 12 pitches of the scale. Furthermore, the choice will more likely fall on a step that occupies a wider footprint on the y-axis, corresponding in this case to the lower pitches of the scale, just as we wanted. We can create the cumulative distribution function in figure 9.22 as follows:

    RealList accumulate(RealList L){
        For (Integer i = 1; i < Length(L); i = i + 1){
            L[i] = L[i] + L[i - 1];    // add the running total of all previous elements
        }
        Return(L);
    }


Starting with the second element in the list (indexed as 1), we replace this element with its original value plus the value of the previous element. As we proceed through the list, each list element will be equal to itself plus all previous elements. Given the preparation of the RealList r performed above, Print(accumulate(r)); prints {0.15, 0.29, 0.42, 0.54, 0.64, 0.73, 0.81, 0.87, 0.92, 0.96, 0.99, 1.0}.
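The two preparation steps can be checked with a few lines of Python. This is only a sketch, not the book's Musimat code: the names `normalize` and `cumulative` are chosen here for illustration, and exact fractions stand in for the built-in realToRational() function. Note that Python reduces fractions, so 12/78 is stored as 2/13.

```python
from fractions import Fraction
from itertools import accumulate

# Weights for the 12 pitches, lowest to highest, as in list f.
weights = [12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1]

def normalize(ws):
    """Scale the weights so that they sum to exactly 1."""
    total = sum(ws)
    return [Fraction(w, total) for w in ws]

def cumulative(ps):
    """Running totals of the probabilities (the cumulative distribution)."""
    return list(accumulate(ps))

probs = normalize(weights)
cdf = cumulative(probs)

# Rounded to two places, these are the step heights of figure 9.22.
print([round(float(c), 2) for c in cdf])
```

Because the arithmetic is exact, the last step is exactly 1, so the steps span the unit interval with no rounding gap at the top.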

We have prepared the cumulative distribution function, and now we can access it with a random value to select a pitch. Pick a number in the unit interval to be the next note of the melody:

    Real R = Random();

R will fall within the range of one of the 12 steps in figure 9.22 because both Random() and the cumulative distribution function exactly span the unit interval, 0 to 1. For example, if R equals 0.1, then by inspection of figure 9.22, we can see that R lies within the first step, which covers the interval [0, 0.15], so the pitch that this value of R selects is C.

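To automate this, we start at the top end of the cumulative distribution function and work down. As we go, we compare the value of R to the current step size. We've gone one step too far when the value of R exceeds the step size, so we return the previous step as the answer, and stop. If R never exceeds a step, it lies within the first step, and the answer is 0.

    Integer getIndex(RealList L, Real R){    // L holds the cumulative distribution
        Integer i;
        For (i = Length(L) - 1; i >= 0; i = i - 1){
            If (R > L[i]){
                Return(i + 1);    // R lies in the step just above L[i]
            }
        }
        Return(0);    // R exceeded no step, so it lies in the first one
    }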

Figure 9.23

(Boring) musical example of weighted random values: C#, G#, C, E, D, G, F, D#, F, A#, C#, D#, D, F#, E, D, C#, D#, E, B, A, F#, C, C#, G.

We can invoke getIndex() as follows:

    Real R = Random();
    Integer p = getIndex(f, R); // where f was defined previously
    Print(p);

If R is 0.1, then p prints 0. Now let's bring all the pieces together. Here is a program that creates a melody of 25 pitches favoring pitches that are at the low end of the chromatic scale:

    RealList f = {12.0, 11.0, 10.0, 9.0, 8.0, 7.0, 6.0, 5.0, 4.0, 3.0, 2.0, 1.0};
    StringList n = {"C", "C#", "D", "D#", "E", "F", "F#",
                    "G", "G#", "A", "A#", "B", "c"};

    f = normalize(f, sum(f)); // replace f with its normalized form
    f = accumulate(f);        // calculate cumulative distribution function

    StringList s;             // a place to put the result

    For (Integer i = 0; i < 25; i = i + 1){
        Integer p = getIndex(f, Random());
        s[i] = n[p];
    }

    Print(s);                 // print the melody

Running this program will generate something like figure 9.23, depending upon the values produced by Random(). As we see, lower pitches are favored in approximately the proportions we specified. The longer the sample melody, the more likely the pitch choices would conform on average to the distribution function.
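For readers who want to run the experiment, here is a minimal Python sketch of the same procedure. It is a transliteration under assumptions, not the book's program: the names `make_cdf`, `get_index`, and `melody` and the seed value are ours, and Python's `random.Random` stands in for Random(). The second print tallies a long melody to show how the pitch counts drift toward the specified proportions.

```python
import random
from collections import Counter
from itertools import accumulate

NOTES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B", "c"]
WEIGHTS = [12.0, 11.0, 10.0, 9.0, 8.0, 7.0, 6.0, 5.0, 4.0, 3.0, 2.0, 1.0]

def make_cdf(weights):
    """Normalize the weights, then accumulate them into step heights."""
    total = sum(weights)
    return list(accumulate(w / total for w in weights))

def get_index(cdf, r):
    """Scan from the top step down and return the step that contains r."""
    for i in range(len(cdf) - 1, -1, -1):
        if r > cdf[i]:
            return i + 1
    return 0  # r exceeded no step, so it lies within the first one

def melody(length, rng):
    """Choose `length` pitches, favoring the low end of the scale."""
    cdf = make_cdf(WEIGHTS)
    return [NOTES[get_index(cdf, rng.random())] for _ in range(length)]

rng = random.Random(2006)  # arbitrary fixed seed so that runs are repeatable
print(melody(25, rng))

# Tally a much longer melody: low pitches should dominate.
counts = Counter(melody(10000, rng))
print(counts.most_common(3))
```

The thirteenth note name is only reachable if floating-point rounding leaves the top step a hair below 1.0, which is one reason to keep it in the list.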

Unfortunately, this melody is dreadfully dull, but it strictly obeys our requirements. This goes to show that one only gets back from an approach like this exactly what one specifies. A more graceful melody might rise to its climax gradually, then fall at the end. The following example accomplishes this by selecting among a set of probability distributions at different points of the melody.

    Integer N = 13; // each list specifies 13 pitches

    RealList a = { // force choice to be pitch C
        1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
    };

    RealList b = { // force C#, D, D#, E, or F
        0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0
    };









Figure 9.24

(Less boring) musical example of weighted random values: C, D, C#, D#, E, D, D#, F, E, F#, A#, B, G#, G, G, B, c, A#, F#, A, D, E, E, D#, C. Brackets beneath the melody mark the sections governed by distributions a, b, c, d, e, b, and a.

Running this program will generate something like figure 9.24, depending upon the values produced by Random(). The distributions responsible for each section are shown in the figure.
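The idea of switching distributions from section to section can be sketched in Python as follows. Everything beyond lists A and B is an assumption made for illustration: the lists C, D, and E, the section lengths in PLAN, and the function name `shaped_melody` are hypothetical, and the standard-library call `random.Random.choices` replaces the hand-built cumulative distribution function.

```python
import random

NOTES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B", "c"]

# A and B follow the comments in the text; C, D, and E are hypothetical.
A = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]  # force C
B = [0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0]  # force C#, D, D#, E, or F
C = [0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0]  # hypothetical middle register
D = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0]  # hypothetical upper register
E = [0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1]  # hypothetical climax register

# Hypothetical plan: (distribution, number of notes) per section, 25 notes in all.
PLAN = [(A, 1), (B, 5), (C, 5), (D, 5), (E, 4), (B, 4), (A, 1)]

def shaped_melody(plan, rng):
    """Draw each section's notes from that section's own distribution."""
    notes = []
    for weights, count in plan:
        notes.extend(rng.choices(NOTES, weights=weights, k=count))
    return notes

print(shaped_melody(PLAN, random.Random(9)))  # arbitrary seed
```

Each note is still chosen independently of its neighbors; only the distribution in force changes as the melody proceeds.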

The musical example in figure 9.24 is certainly an improvement, but I doubt it would win any prizes. Certainly a composer of a melody takes its whole shape into consideration during writing, but successive weighted random selections are completely independent of the past and future. Many composers have used techniques like this to obtain freedom from predictable musical contexts. But we must have a way to correlate past and future choices to the present before random choice techniques are of use in those musical styles that manipulate listener expectation. The next section lays the foundations for a mathematics of expectation.

9.15 Information Theory and the Mathematics of Expectation

Information is a property of a message that is transmitted from a sender to a receiver via a signaling system (see section 6.1). We know intuitively what information is, but when we look more deeply, it has some unusual characteristics. For example, it is possible to quantify the amount of information in a message.

Suppose you receive a letter from a friend announcing her engagement. The letter only contains information if you don't already know that she's engaged, that is, if you were uncertain about the contents of the letter. For example, if a friend had told you the news by phone before the letter arrived, the letter carries no information; in fact, it is redundant. We see from this example that information and redundancy have a curious relation to uncertainty.

Shannon and Weaver (1949) developed information theory to study the quantitative aspects of information. They were not concerned with the qualitative meaning or value of information but strictly focused on how much information was communicated by different kinds of messages. Consider the problem, for instance, of music dictation. Traditionally, music students are taught music dictation by a professor who plays music that the students must learn how to write down. Suppose we are students who are required to take a class that the syllabus calls Music Dictation from Hell 101. Our sharpened #2 pencils are at the ready, poised over blank music paper.

On the first day, the professor says, "Class, I am going to play one note, middle C, over and over again for the next hour. Be sure to write them all down correctly." He then goes to the piano and plays C, C, C, C, C, C, .... Any time someone coughs or there is a loud disturbance, we can't hear the piano, so we write C, C, ?, C, C, ?, C, .... No matter, we know that the missing notes are C.

On the second day, only the students who need this class to graduate show up. The professor says, "Class, I am going to play the C major scale for the next hour. Be sure to write them all down correctly." So we write C, D, E, F, G, A, B, C, D, E, .... Occasionally, we nod off, missing a note or two, so we write C, D, ?, ?, G, .... No matter, we know exactly what the missing notes are. As before, the message is almost totally redundant, but the redundancy allows us to recover from any transmission errors.

On the third day, a miserable handful of students straggles into the room. "Class, I am going to play each of the 88 keys on the piano in random order, but you'll be glad to know that I won't repeat any key until I've played every one of them. Be sure to write them all down correctly." We scribble furiously: C3, G#4, B♭5, F7, G1, .... It seems impossibly difficult at first. Any disturbance in the room means we've irretrievably lost that note because we can't predict what it will be. But we discover that it becomes easier as we go along because we know the professor won't repeat any note until he's played all the others. By the time he has played 78 notes, if we miss a note it's not too bad because there are only ten possible notes left that he could play. And when he has played 87 notes, the eighty-eighth note is a certainty; we don't even need to hear it to write it correctly. The information content of each subsequent note declines while its redundancy increases because each new note played narrows the choices of what notes can be played subsequently.
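The decline on day 3 can be put in numbers. The sketch below assumes that every key not yet played is equally likely to come next, and it measures each note by the base-2 logarithm of the number of remaining choices, which is the surprisal measure developed in section 9.15.2. The function name `bits_for_next_note` is ours.

```python
import math

def bits_for_next_note(already_played, keys=88):
    """Bits needed to identify the next note when no key may repeat."""
    remaining = keys - already_played
    return math.log2(remaining)

for played in (0, 44, 78, 87):
    print(played, round(bits_for_next_note(played), 2))
# 0 6.46
# 44 5.46
# 78 3.32
# 87 0.0
```

With ten keys left, a missed note costs about 3.3 bits; the eighty-eighth note costs nothing at all.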

On the fourth day, you and I are the only two students desperate enough to show up. The professor says, "Class, I am going to play each of the 88 keys on the piano in random order, and I may repeat keys any time I like. Be sure to write them all down correctly." The only source of information about what note will be played next is the note itself. Information in each note is very high, redundancy is very low. But we hear patterns occasionally as we go along. We write, "Repeated B♭5 four times in a row" or "Played melody of Moonlight Sonata" as a shorthand. These shorthands allow us to recover information and squeeze out redundancy in what we write, because otherwise we'd have to enter all the notes or write out the melody of the Moonlight Sonata. (This kind of information recovery, by the way, is similar to one stage in the process used by MP3 encoders to compress musical sound.)

On the fifth day, the professor does not come, but there's a note on the piano that says, "Go down to the beach and write down every note you hear in the ocean's waves. Be sure to write them all down correctly." We go to the beach. Overwhelmed by the multitude of frequencies in each splash of the waves, we cannot write down anything.

Throughout the week, the professor played patterns that went from great certainty to great uncertainty. We became aware that the amount of information carried by what the professor actually played was a function of our uncertainty about what could be played. The more we knew about what was coming, the less information was conveyed by what was communicated. Every constraint the professor imposed on his freedom of choice resulted in a decrease of information in the music itself. We observed the value of redundancy to help prevent information loss when noise disrupts the communications channel. We also learned that we have emotional reactions to different degrees of information and redundancy.





9.15.1 Entropy and Redundancy

Shannon and Weaver (1949) formalized their ideas about information using the concept of entropy, which they adapted to their purposes from the physical sciences. In chemistry, entropy is a measure of the ways in which the energy of a molecular system is distributed among the motions of its particles, its thermodynamic probability. In information theory, entropy is a measure of the ways in which the information of a signaling system is distributed among its communications. A highly entropical microparticle distributes its energy widely among its possible motions. A highly entropical signal requires a large number of independent facts in order to fully communicate it. In terms of the Music Dictation from Hell example, days 1 and 2 were low-entropy days and the rest were high-entropy days.

9.15.2 Surprisal 

On day 1 of Music Dictation from Hell, the probability that the next note would be the same as the preceding note was 1.0, because there was no unexpectedness or surprisal about what note the professor would play. As the probability of an event decreases from 1.0 toward 0, the surprisal goes from zero to infinity. But what is the exact trajectory of this relation?

Recall day 4 of Music Dictation from Hell. As the professor plays notes at random over the entire range of the keyboard, suppose you and your friends devise a game to pass the time, betting on which key the professor will play next. You wager that the next note will be below the midpoint of the keyboard.⁸ The probability is 44/88 = 1/2 that you will be right. If the professor's next note is as you predicted, you are pleasantly surprised, and your friends mark down 1; otherwise they mark down 0. Since there are only two possible outcomes, the amount of surprisal requires one binary digit, called one bit, to represent (see volume 2, chapter 1).

Suppose you take a bigger risk and wager that the next note will be in the bottom quarter of the keyboard. The probability is 22/88 = 1/4. Since your risk has doubled, you'd be twice as surprised in the event you guessed correctly. You'd need two bits to represent the amount of surprisal. With two bits, you can represent a magnitude of 4.

Wagering that the next note is in the bottom eighth has probability 11/88 = 1/8, requiring three bits to represent the amount of surprisal because you can represent a magnitude of 8 with three bits. Probability of 1/16 requires four bits of surprisal; probability of 1/32 requires five bits. These examples can be expressed as follows:

$p = \frac{1}{2^s},$    Probability and Surprisal (9.14)

where p is probability and s is surprisal. Equation (9.14) finds probability given surprisal. To find surprisal given probability, we solve (9.14) for s:

$s = \log_2 \frac{1}{p} = -\log_2 p = -\frac{\ln p}{\ln 2},$    (9.15)

where ln x is the natural logarithm to the base e.





For example, the probability p of predicting the next individual key the professor plays is 1/88, and its corresponding surprisal is 6.46. One advantage of surprisal is that where probabilities multiply, surprisals merely add. For example, the probability of guessing two individual keys in succession is 1/88² = 1/7744, but the surprisal is merely 6.46 + 6.46 = 12.92.
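Equation (9.15) is easy to try out. The following Python sketch (the function name `surprisal` is ours) uses the natural-logarithm form of the equation and reproduces the wagers and the two-key example from the text.

```python
import math

def surprisal(p):
    """Surprisal in bits of an event with probability p, per equation (9.15)."""
    return -math.log(p) / math.log(2)

# The wagers: halves, quarters, and eighths of the keyboard.
for p in (1 / 2, 1 / 4, 1 / 8):
    print(p, round(surprisal(p), 2))

# A single key, then two keys in succession.
one_key = surprisal(1 / 88)
two_keys = surprisal(1 / 88 ** 2)
print(round(one_key, 2), round(two_keys, 2))
```

Multiplying the two probabilities and adding the two surprisals arrive at the same 12.92 bits, which is the whole convenience of working with logarithms.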

We can extend (9.15) to represent the surprisal for every key. Let each key be labeled $x_i$, i = 1, 2, ..., M, where M = 88 is the number of keys. Let the probability that the ith key is pressed be $P_i$. Then the surprisal of the ith key's being played can be defined as

$s_i = -\log_2 P_i.$    Surprisal (9.16)

The negation in (9.16) reminds us that the surprisal of an event increases as its probability 
decreases. 

Suppose the professor plays a total of N notes. If the ith key is played $N_i$ times, then the average surprisal of all pitches in the melody would be

$H(X) = \sum_{i=1}^{M} \frac{N_i}{N} s_i,$    Average Surprisal (9.17)

where X represents all possible keys on the piano keyboard.

As the total number of notes N increases to infinity, the ratio $N_i/N$ tends to $P_i$. By combining this with the definition for $s_i$ given in (9.16), we have

$H(X) = -K \sum_{i=1}^{M} P_i \log_2 P_i.$    Uncertainty (9.18)

H(X) is a measure of the uncertainty of the system, and K is a positive constant of proportionality. By suitable adjustment of K, we may choose any base for the logarithm. Use of base 2 logarithms is fairly standard, but in general Shannon and Weaver defined the information in a system X as

$I(X) = -K \sum_{i=1}^{M} P_i \ln P_i.$    Information (Entropy) (9.19)
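A short Python sketch may make the step from (9.17) to (9.18) concrete. It takes K = 1 and base 2 logarithms; the function names and the three-note toy example are assumptions made here for illustration.

```python
import math
from collections import Counter

def entropy_bits(probs):
    """Equation (9.18) with K = 1; terms with zero probability contribute 0."""
    return sum(-p * math.log2(p) for p in probs if p > 0)

def average_surprisal(notes, probs):
    """Equation (9.17): weight each key's surprisal by how often it was played.

    `probs` maps each key name to its probability P_i.
    """
    n = len(notes)
    counts = Counter(notes)
    return sum((counts[k] / n) * -math.log2(probs[k]) for k in counts)

# Day 4: all 88 keys equally likely.
print(round(entropy_bits([1 / 88] * 88), 2))

# A toy melody whose note counts match its probabilities exactly.
toy_probs = {"C": 0.5, "D": 0.25, "E": 0.25}
print(average_surprisal(["C", "C", "D", "E"], toy_probs))
print(entropy_bits(toy_probs.values()))
```

When the observed ratios equal the probabilities, as in the toy melody, the average surprisal and the entropy coincide; for a real melody they agree only in the long run.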


They noted a striking resemblance of this equation to the equation relating thermodynamic probability to entropy:

$H = -k \sum_{i=1}^{M} W_i \ln W_i,$    Thermodynamic Probability (Entropy) (9.20)

where $W_i$ is the thermodynamic probability of each state, k is Boltzmann's constant, equal to 1.3807 × 10⁻²³ J K⁻¹, and H is the resultant entropy. They then related entropy to information by the simple expedient of the ratio k/K.⁹





When $\log_2 x$ is used, the unit of entropy is called a bit (though this definition is more flexible than the bits in a computer memory). When ln x is used, the unit is called a nat. For $\log_{10} x$ the unit is called a hartley.¹⁰
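Since the three units differ only in the base of the logarithm, converting between them is a matter of a constant factor. A minimal Python sketch, with a function name and `base` parameter of our own choosing:

```python
import math

def entropy(probs, base=2.0):
    """Entropy of a distribution using logarithms to the given base."""
    return sum(-p * math.log(p, base) for p in probs if p > 0)

coin = [0.5, 0.5]
bits = entropy(coin, 2.0)        # base 2: bits
nats = entropy(coin, math.e)     # base e: nats
hartleys = entropy(coin, 10.0)   # base 10: hartleys

print(bits, nats, hartleys)
```

One bit equals ln 2 (about 0.693) nats, or log10 2 (about 0.301) hartleys, which is the adjustment of K described above.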

Summarizing equations (9.19) and (9.20), Shannon (1948) makes the following points:

■ Entropy H of a communications channel will be zero "if and only if all the $P_i$ but one are zero, this one having the value unity. Thus, only when we are certain of the outcome does H vanish. Otherwise H is positive." This case corresponds to day 1 of Music Dictation from Hell.

Only absolute certainty banishes entropy absolutely.

■ "For a given n, H is a maximum and equal to log n when all the $P_i$ are equal (i.e., 1/n). This is also intuitively the most uncertain situation." This case corresponds to day 4 of Music Dictation from Hell.

The most uncertain situation has the maximum entropy.

■ "Any change towards the equalization of the probabilities $P_1, P_2, \ldots, P_n$ increases H." Conversely, any change that makes probabilities less equal reduces H. For example, on day 3 of Music Dictation from Hell, H was gradually reduced as notes that were played were removed from the pool of possible notes.
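Shannon's three points can be spot-checked numerically. The sketch below uses a four-symbol system with base 2 logarithms; the four distributions are invented for illustration.

```python
import math

def entropy_bits(probs):
    """Entropy in bits; terms with zero probability contribute nothing."""
    return sum(-p * math.log2(p) for p in probs if p > 0)

certain = [1.0, 0.0, 0.0, 0.0]      # one outcome is sure (day 1)
skewed = [0.7, 0.1, 0.1, 0.1]       # far from equal
less_skewed = [0.4, 0.2, 0.2, 0.2]  # closer to equal
uniform = [0.25, 0.25, 0.25, 0.25]  # all equal (day 4)

for name, dist in [("certain", certain), ("skewed", skewed),
                   ("less skewed", less_skewed), ("uniform", uniform)]:
    print(name, round(entropy_bits(dist), 3))
```

The entropy is zero for the certain case, rises as the probabilities are made more equal, and peaks at log2 4 = 2 bits for the uniform case.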

9.15.3 Department of Redundancy Department 

We can use the definition of maximum entropy to show the relation of entropy to redundancy. Redundancy relates the actual entropy H(X) to its theoretical maximum, log W, as follows:

$R(X) = 1 - \frac{H(X)}{\log W}.$    Redundancy (9.21)

Because redundancy is normalized for the length of the communication, it is actually more useful than entropy as a way to compare sequences.
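As a rough numerical illustration of (9.21), the Python sketch below takes the theoretical maximum to be the base-2 logarithm of the number of possible symbols, following Shannon's second point above. That reading of the maximum, and the function names, are assumptions of this sketch.

```python
import math

def entropy_bits(probs):
    """Entropy in bits; terms with zero probability contribute nothing."""
    return sum(-p * math.log2(p) for p in probs if p > 0)

def redundancy(probs):
    """1 - H / H_max, with H_max assumed to be log2 of the symbol count."""
    h_max = math.log2(len(probs))
    return 1.0 - entropy_bits(probs) / h_max

day1 = [1.0] + [0.0] * 87   # only middle C is ever played
day4 = [1 / 88] * 88        # every key equally likely

print(redundancy(day1))
print(round(redundancy(day4), 6))
```

Day 1 comes out completely redundant and day 4 has essentially no redundancy, matching our experience in the classroom.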

Information theory presents us with the somewhat counterintuitive outcome that the greatest amount of information is associated with the greatest degree of uncertainty. But information is not the same thing as knowledge.

Information relates to the breadth of what could be communicated. Knowledge is a distillation of the regularity and order arising from a communication.

9.16 Music, Information and Expectation

Ordinarily, music, like most systems, contains some entropy and some redundancy. In Music Dictation from Hell, we saw that the extremes of entropy and redundancy kill our interest. If the degree of redundancy is too high (as on days 1 and 2), the music is too predictable, and the listener eventually gets bored and stops listening. If the degree of entropy is too high (as on days 4 and 5), the music is too unpredictable, and the listener eventually gets frustrated and stops listening. In between is where music happens: when entropy and redundancy sustain a fluid, dynamic balance, there is enough regularity to orient the listener in the music but also enough novelty to preserve interest. This suggests that, in general,

Composing is about the manipulation of interest, affect, and attention.

This shouldn't be too surprising: after all, the human neocortex is a very refined organ of expectation. A fundamental job of the neocortex is anticipating what may happen next. One of the ways we entertain ourselves is by exercising this faculty in play.

Susanne Langer (1953) characterized music as a kind of emotional algebra: "Music conveys general forms of feelings, related to specific ones as algebraic expressions are related to arithmetic [expressions]."

Leonard Meyer (1956) proposed an "affect theory of music," writing "Emotion or affect is aroused when a tendency to respond is arrested or inhibited.... What a musical stimulus or a series of stimuli indicates ... [is] not extramusical concepts and objects but other musical events which are about to happen.... Embodied musical meaning is, in short, a product of expectation" [italics added]. Meyer has precisely defined musical meaning, and it bears repeating:

Expectation is a prediction based on current and past experiences. Musical meaning is a function of expectation.

Aristoxenus said much the same when he wrote,

Musical cognition implies the simultaneous recognition of a permanent and a changeable element ... for the apprehension of music depends upon those two faculties, sense perception and memory; for we must perceive the sound that is present, and remember that which is past. In no other way can we follow the phenomenon of music.¹¹

If audition and memory are the engines that drive expectation in music, expectation itself is the beginning and end of music. Freyd (1987) developed what she calls "representational momentum" to characterize expectation of movement: "The perceptual system is geared to perceive transitions in real time" (428). In other words, the brain constantly anticipates the future. This must be so; how else would we catch a baseball, drive a car, or comprehend the rise and fall of a melody? Freyd (1993) writes, "Just as time is a dimension in the external world, inseparable from the other physical dimensions, so might time be a dimension in the represented world [in the mind]" (105).

Many traditional compositional practices are aimed at securing and maintaining the listener's interest through expectation. Consider the following musical motive:


If I then play 





you become aware that I am sequencing a motive rising by diatonic steps, and you may expect I will repeat it. If I meet your expectation by extending the sequence


representational momentum increases and entropy decreases. I risk losing your attention because you now recognize the pattern, and since there is hardly any new information in it, you may start to lose interest. If instead of the previous motive, I play as a final motive



I have frustrated your representational momentum by shifting your attention from the horizontal melodic sequence to the vertical harmonic resolution. Surprise renews interest. There is also the satisfaction of arriving at a complete musical thought by cadencing.

9.16.1 The Golden Mean

Here is the entire phrase just described:

[Musical example: motive 1, motive 2, cadence.]



Notice that the sequence's momentum is broken about two thirds of the way through by the cadence. It is very common for musical patterns to veer off in a new direction near ratios of the golden mean. This proportionality appears in musical structures of all kinds and all levels of compositional scope, ranging from motivic fragments to cycles of works. For example, the boundary between the exposition and development section in many Mozart sonatas begins in the vicinity of (and sometimes even exactly on) the measure that divides the movement by the golden mean (Putz 1995; Kay 1996). Similar arrangements appear in the works of Beethoven, Webern, and many other composers (Norden 1964). Did these composers intentionally structure their music to have these proportions? We know that Mozart was fascinated by mathematics, but there's little hard evidence one way or the other. On the other hand, in his work Music for Strings, Celeste and Percussion, Béla Bartók used the golden mean so accurately, so often, and at so many structural levels simultaneously that it is easy to assume he did so intentionally (Lowman 1971).

But proportional analysis of music only goes so far. Music more resembles objects shaped by natural forces than objects shaped by axiom. For example, Putz (1995) found that large structures in Mozart's sonatas came statistically much closer to the golden mean than smaller structures. He attributed this (correctly, I believe) to the tendency of natural proportional structures, such as segment sizes of shells and branching patterns in plants, to become increasingly approximate at the extremes of scale.

Another reason for the limited success of proportional analysis of music is that a high degree of strict proportionality on many levels of scale is highly redundant, and this is inconsistent with the compositional exploitation of expectation and surprise. Structural predictability can only be useful to a composer up to a point because music is designed to gain and maintain interest, and this requires a certain degree of structural ambiguity. Consequently, materials may be ordered, combined, disordered, and recombined in a manner that defies easy analysis. Meyer (1956) writes,

Weak, ambiguous shapes may perform a valuable and vital function ... for the lack of distinct and tangible shapes and of well-articulated modes of progression is capable of arousing powerful desires for, and expectations of, clarification and improvement.... some of the greatest music is great precisely because the composer has not feared to let his music tremble on the brink of chaos, thus inspiring the listener's awe, apprehension and anxiety, and, at the same time, exciting his emotions and his intellect.

Information theory and its relation to expectation and surprise show up even at metalevels of the composition process. Wherever there is a belief, there is an opportunity for its deconstruction, with all the same consequences for expectation and surprise. For example, it seems John Cage's primary aim was not to maintain an audience's interest. Rather, he wanted to allow natural processes to manifest directly in his music, in part, I suppose, because this would deconstruct compositional methodology based on interest and expectation. Where Schoenberg and his school sought to erase the expectation of tonal harmony, Cage and others sought to erase the expectation of expectation itself. (Note that this still requires a sense of expectation.) Thus, deconstructionism can be seen as the play of information and expectation in the realm of belief systems.

9.17 Form in Unpredictability 

Music is like a field, bordered on one side by order and regularity and on the other by surprise and irregularity, and the most effective musical domains lie in the middle ground between these borders. Redundant elements communicate a sense of order that is embodied, for example, in the regularities between the various parts of a musical composition. Taste is reflected in the entropical elements, and style is revealed in the pattern of trade-offs made by the composer between order and taste. If we appreciate the sense of order, taste, and style in music, we appreciate the intelligence that informs the composer's work.

The efforts in this chapter to generate compositions by rule have so far shown no particular musical intelligence. Because all values chosen by the Random() function are strictly independent, the music created directly from it is unsatisfying; it lacks the glue (redundancy) that binds music together. But there are mathematical forms, called fractals, that reveal a deep inner structure, very similar to the complex inner structures of music, that combine varying degrees of predictability and unpredictability in one contour.





9.17.1 Self-Similarity 

Consider the Weierstrass function shown in figure 9.25. Like ocean waves, it is shaped from point to point with a balance of predictability and unpredictability. This balance extends across different levels of magnification: the shape of the smaller parts resembles the shape of the larger parts, and vice versa, demonstrating self-similarity at various scales. For example, the contours inside the two boxed sections of the curve in figure 9.25 are similar. This calls to mind the proverb "The more things change, the more they remain the same." A structure is self-similar if, when magnified, its structure remains similar to the original scale. But what defines similarity in this case?

Figure 9.25

Weierstrass function.

Figure 9.26

Weierstrass function power spectrum.

Let's examine how energy is distributed in the partials of the Weierstrass function. Figure 9.26a shows the power spectrum of this function on a linear scale (see volume 2, chapter 3). A power spectrum is basically a means to observe where there is energy in a signal. Most of the energy in this signal is near 0 Hz, and energy drops off quickly with increasing frequency, but it's a little hard to see what's really happening. A clearer picture emerges from figure 9.26b, which depicts the same spectrum, but with log frequency and log amplitude shown on the x- and y-axes, respectively. Viewing the power spectrum as a log-log plot reveals the essential detail of the spectral plot. The lines through the peaks of both plots show the ratio of 1/f, where f is frequency. The tips of the spectral components seem to track this 1/f line. We say that the Weierstrass function has a spectral tendency of 1/f, meaning that the intensity of frequency nf is 1/n the intensity of frequency f. This corresponds to a roll-off of high-frequency energy at the rate of about -3 dB SIL per octave. So how does this characterize similarity? And what does this form of similarity have to do with music?
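The figure of about -3 dB per octave follows directly from the 1/f relation: an octave doubles the frequency, which halves the intensity, and a halving of intensity is 10 log10(1/2) decibels. A small Python check (the function name and its `exponent` parameter are ours):

```python
import math

def rolloff_db(frequency_ratio, exponent=1.0):
    """Change in intensity level, in dB, across a frequency ratio
    for a spectrum whose intensity falls as 1/f**exponent."""
    intensity_ratio = frequency_ratio ** -exponent
    return 10.0 * math.log10(intensity_ratio)

print(round(rolloff_db(2.0), 2))   # one octave of a 1/f spectrum
print(round(rolloff_db(10.0), 2))  # one decade of a 1/f spectrum
```

The same function shows that a steeper 1/f² spectrum would fall at about -6 dB per octave.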

9.17.2 Fractal Geometry 

The ordinary materials of Euclidian geometry, such as lines, surfaces, and volumes, are organized by their dimension, which can be intuitively defined as the number of numbers needed to uniquely locate a point in space. To locate a point some distance along a line or curve, one number suffices, so lines and curves are one-dimensional. One number also serves to measure the distance from a point on the circumference of a circle, or to measure along the edge of an object.

For a point on a plane, two numbers are required, so a plane is two-dimensional. We can organize the two numbers in a variety of ways. Typically, we establish an orthogonal coordinate system with linear dimensions and express points in Cartesian coordinates such as [x, y]. But we could consider other non-Euclidian two-dimensional "spaces," such as telephone numbers that consist of a three-digit exchange number followed by a four-digit line number. We could also consider the nonlinear two-dimensional surface of a Möbius strip (figure 9.27) or a deformed surface such as a balloon. Three numbers are required to describe a point in 3-D space, and so forth.

Figure 9.27

M. C. Escher, Möbius Strip II, 1963.

The characteristic size of objects in Euclidian spaces changes in a regular way as the extent of their linear dimensions changes. For example, if a line is doubled in length, its characteristic size also doubles. Doubling the length of a square's side multiplies its area by 4. Doubling the length of a cube's side multiplies its volume by 8. Abstracting from these examples, if D is dimension and L is a scaling coefficient, then the characteristic size s of an object is given by $s = L^D$. Solving for D, we have

$D = \frac{\ln s}{\ln L}.$    Dimension (9.22)
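Equation (9.22) can be tried on the three examples just given. In this Python sketch the function name `dimension` and its argument names are ours.

```python
import math

def dimension(size_factor, length_factor):
    """D = ln s / ln L: how the characteristic size s grows
    when linear extent is scaled by L."""
    return math.log(size_factor) / math.log(length_factor)

print(dimension(2, 2))  # line: doubling the length doubles the size
print(dimension(4, 2))  # square: doubling the side quadruples the area
print(dimension(8, 2))  # cube: doubling the side multiplies the volume by 8
```

The results are 1, 2, and 3 (to within floating-point rounding), the familiar whole-number dimensions of Euclidian geometry.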

Euclidian geometry covers the cases where D = {1, 2, 3, ...}, D ∈ I. However, there are structures that look like curves, such as the one in figure 9.25, but that don't behave like curves because position along the curve can't be described as a one-dimensional offset from some other point. Such shapes do not yield integer values for D and do not obey the scaling rule for Euclidian geometries. These shapes are not mere pathological¹² curiosities. They reflect the structures of coastlines, the branching of plants and blood vessels in the lungs, the annual flood tides of the Nile, and many other natural phenomena, including music. To accommodate such geometrical anomalies, mathematicians have had to devise more nuanced definitions of dimension, allowing for fractional dimensions. Objects with fractional dimension were nicknamed fractals by Benoit Mandelbrot (1977).

The Koch Snowflake  A simple fractal example is the Koch snowflake. To generate this shape, begin with a triangle, such as an equilateral triangle with sides of length 1. Then, for each side, divide the length by 3, and build another triangle with its base upon every middle segment and its apex pointing outward. Last, discard the base segment, leaving only the sides. The first four approximations are shown in figure 9.28.

The shape becomes ever more detailed, and in the limit as the number of iterations goes to infinity,
the distance between any two points along the curve becomes infinite, even though the area
bounded by the curve remains finite. Therefore, in the limit, it is impossible to determine a length
along the boundary. The structure shows similarities at all levels of magnification, so it is
self-similar, which makes sense, considering how it is constructed.
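As a rough numerical check of the scaling relation $s = L^D$, the following Python sketch solves for D and tracks how the snowflake's perimeter grows. The function names are hypothetical, and the values used for the Koch curve (each segment replaced by 4 segments, each 1/3 as long) are read off the construction described above rather than stated in the text.

```python
import math


def similarity_dimension(pieces, scale):
    """Solve s = L**D for D, that is, D = ln(s) / ln(L)."""
    return math.log(pieces) / math.log(scale)


def koch_perimeter(iterations, side=1.0):
    """Perimeter of the snowflake after a number of refinement steps.

    Each step replaces every segment with 4 segments that are 1/3 as long,
    so the total length is multiplied by 4/3 per step.
    """
    return 3 * side * (4 / 3) ** iterations


# Euclidean sanity check: doubling a cube's side (L = 2) gives s = 8, so D = 3.
print(similarity_dimension(8, 2))

# Koch curve: scaling by L = 3 yields s = 4 copies, so D lies between 1 and 2.
print(similarity_dimension(4, 3))

# The perimeter grows without bound as the iteration count rises.
for n in range(5):
    print(n, koch_perimeter(n))
```

The perimeter loop illustrates why no finite length can be assigned to the boundary in the limit: every refinement multiplies the length by the same factor greater than 1.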

The Koch snowflake and the Weierstrass function are examples of deterministic fractals
because they are defined by an algorithm or mathematical formula. As shown in figure 9.28, the
regularities of deterministic fractals are self-similar. There are also nondeterministic fractals, or
random fractals, that more closely resemble the natural shapes of coastlines, mountain ranges, and
natural musical signals. The irregularities of random fractals are statistically self-similar.
Although deterministic fractals are infinitely self-similar, the self-similarity of natural fractal
shapes tends to break down at very large and very small scales.



Figure 9.28 

Koch snowflake. 





Figure 9.29

1/f spectra: (a) Scott Joplin piano rags; (b) classical radio; (c) rock radio. (Voss and Clarke 1975; 1978.)

9.17.3 Self-Similarity in Music

Richard Voss and John Clarke, when they were graduate students at the University of California
in Berkeley, observed that a great deal of music, when examined over a long enough time span,
appeared to have a spectral tendency of $1/f^v$, $1 < v < 2$. They observed this by connecting a
spectrum analyzer to the output of an AM radio. The frequency components revealed self-similar
musical structure, especially for frequencies below 1 Hz. Since frequencies below about 50 Hz
correspond to rhythmic and structural elements in music, they reasoned that the compositional
structure of music (sections, phrases, motives, and note durations) exhibits a 1/f spectral
tendency, revealing an even balance between entropy and redundancy.

Figure 9.29 shows some of their results. The Scott Joplin piano rags were averaged over an entire
recording, perhaps an hour of music. Voss and Clarke (1975; 1978) attributed the high variation
in this curve between 1 and 10 Hz to strongly characteristic rhythmic elements in Joplin's music.
The rock music station recorded over a 24-hour period shows a spectral bump at about an hour's
duration, perhaps corresponding to station breaks. Wondering how universally this result would
hold, Voss and Clarke repeated the experiment with recorded music from a wide variety of musical
ages, locations, and styles. All their subjects showed 1/f spectral tendencies, especially at very
low frequencies. They believed they'd found experimental evidence that music favors this spectral
tendency universally. Although the effect may not be universal, approximate self-similarity of
musical structures has been widely demonstrated.

9.17.4 Generating Scaling Signals

The work of Voss and Clarke has evoked a great deal of interest and some controversy.
Musicologists have surveyed a great deal of music for fractal elements, and composers have experimented
with fractal designs. To do either requires a way to observe and generate fractals. Following are
some techniques to generate fractal signals.

Generating Deterministic Fractal Signals. Although it appears to be random, the generating
equation for the Weierstrass function is strictly deterministic, like the Koch snowflake. In fact, it
is just a variation on Fourier synthesis, summing a number of sinusoids at various harmonics (see
volume 2, chapter 9). Unlike Fourier synthesis, the harmonics and their corresponding amplitudes
are in an exponential rather than a linear sequence:




$$w(t) = \sum_{k=0}^{\infty} r^{kH} \sin(\pi r^{-k} t). \qquad \text{Weierstrass Function (9.23)}$$


Figure 9.25 shows the Weierstrass function for r = 0.5, H = 1.0, and N = 32. If we could hear the
waveform in figure 9.26 it would sound like a rich pipe organ tone.

In equation (9.23) the parameter r, called the lacunarity, controls the texture of the spectrum.
It can usefully vary over the range 0 < r < 1. H is called the Hurst exponent, or more intuitively,
the self-similarity parameter or long-range correlation parameter. It has the range 0 < H < 1 and
controls the spectral tendency because it determines the amplitudes of the harmonic sequence. H
is related to the fractional dimension D = 2 - H (Falconer 1990). As H goes to 0, high frequencies
in the spectrum become stronger until, when H = 0, the spectrum no longer drops off in amplitude
with higher frequencies. The Weierstrass function varies in dimensionality between 1-D and 2-D
as H varies. Near H = 0 the curve is so dense that it seems to fill up the whole plane and so has
dimensionality near 2-D.
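To make the roles of r and H concrete, here is a minimal Python sketch of a truncated Weierstrass sum. It assumes the form suggested by the surrounding text: the kth partial has amplitude $r^{kH}$ and frequency factor $r^{-k}$, and the sum is cut off after N terms. The function name and the sampling grid are illustrative choices, not taken from the book.

```python
import math


def weierstrass(t, r=0.5, H=1.0, N=32):
    """Truncated Weierstrass sum (assumed form, see the lead-in).

    Partial k has amplitude r**(k*H), which shrinks as k grows when 0 < r < 1
    and H > 0, and frequency factor r**(-k), which grows exponentially.
    """
    return sum(r ** (k * H) * math.sin(math.pi * r ** (-k) * t) for k in range(N))


# Sample the function on a coarse grid over 0 <= t < 1.
samples = [weierstrass(i / 256) for i in range(256)]
print(min(samples), max(samples))

# With H = 0 every partial has amplitude 1, so high frequencies are not attenuated.
rough = [weierstrass(i / 256, H=0.0, N=8) for i in range(256)]
print(min(rough), max(rough))
```

Lowering H toward 0 in this sketch makes the upper partials as strong as the fundamental, which is the behavior the text associates with a curve that approaches dimension 2.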

Brownian Noise and the Random Walk. We can relate the independent values of a uniform
random number generator in such a way that they show interdependence and correlation across
time and so achieve self-similarity. As a model of this process, consider the random walk of a
drunk person who repeatedly stands up and stumbles off in an independent random direction,
falls down, and starts off again and again. Clearly, where the drunk was a moment ago
determines the possible places he will fall next, so there is a sense of history, albeit a quixotic one,
to the process. If $U_s$ is a uniform real random sample and $x_n$ is the current point, then Brownian
noise is defined as

$$x_n = U_s + x_{n-1}. \qquad \text{Brownian Noise (9.24)}$$

Because subsequent points depend upon current and previous points, this is a recursive process.
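The recursion can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the step range, the starting value of 0, and the function name are all hypothetical, and no bounds are applied to the walk.

```python
import random


def brownian_noise(n, step=1.0, seed=None):
    """Return n samples of x_n = U_s + x_(n-1), starting from x = 0.

    U_s is drawn uniformly from [-step, step]; each output is the running
    sum of all the random samples drawn so far.
    """
    rng = random.Random(seed)
    x = 0.0
    out = []
    for _ in range(n):
        x = rng.uniform(-step, step) + x  # the new point depends on the previous one
        out.append(x)
    return out


walk = brownian_noise(1000, step=0.5, seed=1)
print(walk[:5])
print(min(walk), max(walk))
```

Because each sample is a running sum, neighboring values stay close together while the overall range is free to drift, which is the unbounded behavior that a practical generator has to rein in.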






Brownian motion was first identified by Jan Ingenhousz in 1785, but it was named for Robert
Brown, who rediscovered it in 1827 while watching the dance of pollen grains in a drop of water
under a microscope. Albert Einstein identified this in 1905 as the effect of molecules of water,
excited by heat, striking the pollen grains. Brownian motion describes the movement of
microparticles in liquids and gases. Their movement is subject to Newton's first law of motion, so their
inertia would make them want to travel in a straight line, but they can move only so far on average
(the mean free path) without bumping into other microparticles, which sends them off in new
directions. (Calculus alert!) A function is integrated by adding each subsequent point on the
function to its previous point. Brownian motion can be viewed as the integral of uniform random noise.
Figure 9.30 shows an example of Brownian motion in two dimensions.

Because this movement depends not on an absolute position but rather on its previous relative
position, the range of x is theoretically without bounds. For example, if $U_s$ happens to favor positive
outcomes in the long run, $x_n$ could grow toward positive infinity. Because computers have limited
precision, an adjustment must usually be made to keep the random walk within computable limits.
Here is a simple Brownian number generator (F. R. Moore 1990):

Real brownian(Real x, Real w, Real B) {
    Real R;
    Do {
        R = x + Random(-w, w);    // take a random step of at most w away from x
    } While (R > B Or R < -B);    // retry until the result lies within [-B, B]
    Return R;
}

Parameter x is either the initial value of the random walk or the value last calculated by brownian().
Parameter w is called the window parameter because it determines the maximum amount by which
the value of x can change at one time. Parameter B is the bounds, limiting the Brownian motion
to within its range. This method departs from strict Brownian motion by retrying the random
choice until the new value lies within this range.








Figure 9.31 

Brownian noise and its power spectrum. 


We call the brownian() method each time we want a new Brownian number, passing it either
an initial value or the value of its previous output. For example, the following code generated the
function shown in figure 9.30.

Real x = 0.0;
Real y = 0.0;
For (Integer i = 0; i < 1000; i = i + 1) {
    x = brownian(x, 0.5, 0.5);
    y = brownian(y, 0.5, 0.5);
    PlotPoint(x, y);    // plot a point on a graph at location [x, y]
}


A Brownian noise signal and its power spectrum on a log-log plot are shown in figure 9.31. The
straight line in the figure traces the contour of $1/f^2$ for reference.

Fractional Brownian Motion. The preceding Brownian number generator produces a high degree
of local similarity because subsequent points are constrained to remain relatively close to previous
points. But because the random increment at each step is independent, Brownian motion typically only
shows self-similarity in a region of its spectrum, so its fractal quality degenerates with scaling.

Fractional Brownian motion (fBm) is like Brownian motion, but the increments are no longer
independent. Instead, just as low-frequency ocean waves extend their influence over many cycles
of higher-frequency waves, in fBm, local rapidly fluctuating values are influenced by broader,
slower-moving values extending proportionately over the entire spectrum. As fBm is magnified,
it retains its statistically self-similar shape, and so it is fractal regardless of magnification.

Think of it this way. If we had an ideal tape recorder that accurately recorded all frequencies,
and we gradually increased the speed of a tape recording of Brownian noise, the character of the
noise would change (from a relatively low-frequency "whoosh" to a higher-frequency "whish").
But a recording of fBm noise will sound the same regardless of playback speed. All speeds sound
the same because both the signal and the spectrum are self-similar at all levels of scale. A number
of methods can be used to generate fBm noises.





Randomized Weierstrass Method. One way to generate fBm noise is to add a random phase
term to the Weierstrass function:

$$w(t) = \sum_{k=0}^{\infty} r^{kH} \sin(\pi r^{-k} t + \Phi(x)), \qquad (9.25)$$

where $\Phi(x) = \pi U_s r^{kH} x$. In the function $\Phi$, the parameter x allows us to set the strength of the
effect. The strength of phase randomization is scaled as frequency rises so that the overall spectrum
remains approximately 1/f, depending upon the choice of parameters.

Voss's Method. Martin Gardner (1978) reported a fractal noise generator attributed to Voss. A
set of random variables $x_k$ are summed on each sample n, and the result is output. The random
variables are updated at different rates. If $((n))_{2^k} = 0$, then the kth variable is assigned a new random
number $U_s$. The index k ranges from 0 to N - 1. So $x_0$ is randomized every sample, $x_1$ is randomized
every other sample, $x_2$ is randomized every fourth sample, and so on, until finally $x_{N-1}$ is only
randomized every $2^{N-1}$ samples. We can express the formula as follows:

$$f(n) = \sum_{k=0}^{N-1} \bigl\{ \bigl(((n))_{2^k} = 0\bigr),\ \bigl(x_k \leftarrow U_s \text{ else } x_k\bigr) \bigr\} \qquad (9.26)$$

where $U_s$ is a source of random numbers. We can code this method as follows:

Real VossFracRand(Integer n, RealList L) {
    Real sum = 0.0;
    Integer N = Length(L);
    For (Integer k = 0; k < N; k = k + 1) {
        If (Mod(n, Pow(2, k)) == 0) {    // the kth variable is due for a new value
            L[k] = Random(-1.0, 1.0);
        }
        sum = sum + L[k];                // add the kth variable to the output
    }
    Return(sum);
}

The following creates and prints a list of 128 fractal noise samples over four octaves: 

RealList L = {Random(), Random(), Random(), Random()};
RealList R;
For (Integer n = 0; n < 128; n = n + 1) {
    R[n] = VossFracRand(n, L);
}
Print(R);











Figure 9.32 

Voss's fractal generator. 

Figure 9.32 shows how this noise is constructed by this method. Each function changes at a rate
twice as fast as the previous function, and the functions are summed. Random values in the rapidly
changing functions have only local influence, whereas values in the slowly changing functions extend
their influence over many samples of the summed result, giving the result a fractal characteristic.
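For readers who want to run the generator, the following is a hedged Python transcription of the method described above. The names `voss_frac_rand`, `sources`, and `rng` are illustrative, and passing an explicit `random.Random` instance is a choice made here so that runs can be repeated.

```python
import random


def voss_frac_rand(n, sources, rng):
    """Return the nth output sample, updating the list of sources in place.

    Source k receives a new random value whenever n is a multiple of 2**k,
    so source 0 changes every sample and the last source changes most rarely.
    """
    total = 0.0
    for k in range(len(sources)):
        if n % (2 ** k) == 0:
            sources[k] = rng.uniform(-1.0, 1.0)
        total += sources[k]
    return total


rng = random.Random(0)
sources = [rng.uniform(-1.0, 1.0) for _ in range(4)]  # four octaves
noise = [voss_frac_rand(n, sources, rng) for n in range(128)]
print(noise[:8])
```

With four sources, the slowest one is refreshed only every eighth sample, so its value colors eight consecutive outputs while the fastest source changes on every call.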

Spectral Filtering Method. We can generate noise with an arbitrary spectral tendency by scaling
the power spectrum of uniform noise. In fact, completely arbitrary noise functions can be
obtained this way, fractal and otherwise. The method is to compute the Fourier transform of a noise
signal, scale its power spectrum as we like, then retransform with the inverse Fourier transform
(see volume 2, chapter 4).
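The three steps (transform, scale, retransform) can be sketched in plain Python. This is a minimal illustration, not an efficient implementation: it uses a direct O(n²) discrete Fourier transform instead of an FFT, removes the DC term, and scales amplitudes by $f^{-\beta/2}$ so that power falls as $1/f^{\beta}$. The function names and the parameter `beta` are assumptions introduced here.

```python
import cmath
import math
import random


def dft(x):
    """Direct discrete Fourier transform (slow, but dependency-free)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * f * t / n) for t in range(n))
            for f in range(n)]


def idft(X):
    """Inverse of dft()."""
    n = len(X)
    return [sum(X[f] * cmath.exp(2j * math.pi * f * t / n) for f in range(n)) / n
            for t in range(n)]


def shaped_noise(n, beta, seed=None):
    """Uniform noise whose power spectrum is scaled by 1/f**beta."""
    rng = random.Random(seed)
    white = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    spectrum = dft(white)
    shaped = [0j] * n  # bin 0 (DC) is left at zero
    for f in range(1, n):
        # Fold the upper bins so they mirror the lower ones; this keeps the
        # spectrum conjugate-symmetric and the output real.
        freq = min(f, n - f)
        shaped[f] = spectrum[f] * freq ** (-beta / 2.0)
    return [v.real for v in idft(shaped)]


pink = shaped_noise(64, 1.0, seed=3)   # roughly 1/f
brown = shaped_noise(64, 2.0, seed=3)  # roughly 1/f**2
print(pink[:4])
print(brown[:4])
```

Setting `beta` to 0 returns the original noise with its mean removed, which is a convenient way to confirm that the transform pair round-trips correctly.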

9.17.5 Composing with 1/f Noise

In their experiments Voss and Clarke (1978) showed that a 1/f spectral characteristic was widespread
in the structure of music. They conjectured that compositions created with a 1/f spectral characteristic
would sound the most like music. To test this hypothesis, they synthesized melodies of three types
using a computer: the first type made tone and rhythmic selections with a uniform $1/f^0$ noise
generator, the second type used an fBm $1/f^1$ noise generator, and the last used a Brownian $1/f^2$ noise
generator. For each generator type, they created melodies of two octave compass, using pentatonic,
diatonic, and chromatic scales. They only conducted informal listening tests, but they reported that
the consensus of listeners was that the fractally generated examples sounded the most like music.

Mandelbrot's (1977) reaction to the work of Voss and Clarke was to note that "[music] teachers
insist that every piece of music [should] be 'composed' down into the shortest meaningful
subdivisions. The result is bound to be scaling!" (375). Though the work of Voss and Clarke has drawn
widespread interest and seems self-evident, it has been subjected to some skeptical analysis by, among
others, the musicologist Nigel Nettheim (1992), who sought to evaluate and confirm their results.





Nettheim complained, for example, that analyzing long swaths of music broadcast over a radio
would combine spectral contributions from many composers and ages, including announcer's
messages, commercials, and other extraneous nonmusical material. For his own observations, he
limited the analysis window to the duration of individual musical works and observed greater
diversity of spectral tendency for different kinds of music. He also found that the fractal dimension
was often closer to 2 (Brownian) than 1 (fractal). Nettheim's results were extended by Boon and
Decroly (1995). Neither Nettheim nor Boon and Decroly refuted the basic premise of Voss and
Clarke that there is an approximate self-similar structure to the power spectrum of music at low
frequencies, but they showed that there is greater spectral variation, and pointed the way to more
rigorous application of the technique in musicology.

Plato said, "For when there are no words (accompanying music) it is very difficult to recognize
the meaning of the harmony and rhythm, or to see that any worthy object is imitated by them."¹³

By "any worthy object," Plato meant any natural object. To Voss, the appearance of fractal
structure in music bolstered the theory that art imitates nature. This idea has been championed in
virtually every age from the ancient Greeks to the present. But few natural processes seem to be
inherently musical. So the question arises, If art imitates nature, exactly what is being imitated?
Voss's answer is that musical signals, like so many other biological and natural signals, reveal a
self-similar character.

9.18 Monte Carlo Methods

Lejaren Hiller and Leonard Isaacson (1959) are generally regarded as the first to seriously study
composition of music with computers. They used the Illiac computer at the University of Illinois
to create an experimental composition entitled Illiac Suite for String Quartet in 1957. As with
Xenakis's work, chance techniques play a large role in this work, though for quite different
purposes. Hiller and Isaacson (1959) write,

The process of musical composition can be characterized as involving a series of choices of musical elements
from an essentially limitless variety of musical raw materials. Therefore, because the act of composing can
be thought of as the extraction of order out of a chaotic multitude of available possibilities, it can be studied
at least semi-quantitatively by applying certain mathematical operations deriving from probability theory and
certain general principles of analysis incorporated in a new theory of communication called information
theory. It becomes possible, as a consequence, to apply computers to the study of those aspects of the process
of composition which can be formalized in these terms.

Hiller and Isaacson wanted to use computers to model the composing process itself, unlike
Xenakis, who saw them merely as an aid to human composers. So Hiller and Isaacson's
investigation was conducted in the then-novel field of cybernetics. Their approach was to reduce the rules
of various compositional styles (ranging from rudimentary species counterpoint to free atonality)
into a set of numeric determinants that could be incorporated into programs running on the Illiac
computer.






Any technique that uses probability to study complex systems can be called a Monte Carlo method.
These methods were so named because of the similarity of probabilistic simulations to games of
chance and because Monte Carlo, the capital of Monaco, was famous for gambling. These techniques
are now used in many of the physical sciences. Hiller and Isaacson pioneered their use in music. Two
notable methods they used are the random sieve method and Markov chains.

9.18.1 Random Sieve Method 

With this method, choices made by a random number generator are accepted or rejected depending
upon whether they obey certain rules, rather as a sieve strains out some objects and allows others
to pass. One version of this method is outlined in figure 9.33. We begin at the Start state, and if
we are not done, we generate a random value that we subject to tests. If it passes all tests, we accept
the new value, and if we are not done, we go on to the next choice. If it does not pass all tests, we
check to see how many times in a row we've failed to pass the tests. If we've failed so often that
we believe we are stuck, we abort the process and restart. If we're not stuck, we try again with a
different random choice.
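The control flow just described can be sketched as a generate-and-test loop with a restart. The two rules below (no repeated note, no leap wider than four scale steps) are hypothetical stand-ins for real counterpoint rules, and the names `violates_rules` and `random_sieve` and the failure threshold are assumptions made for this illustration.

```python
import random


def violates_rules(melody, candidate):
    """Toy rule set over scale degrees 0..7 (hypothetical, for illustration)."""
    if not melody:
        return False
    leap = abs(candidate - melody[-1])
    return leap == 0 or leap > 4  # reject repeated notes and wide leaps


def random_sieve(length, max_failures=20, seed=None):
    """Build a melody by random choice, discarding choices that break a rule."""
    rng = random.Random(seed)
    while True:  # each pass of this loop is one fresh start
        melody = []
        failures = 0
        while len(melody) < length:
            candidate = rng.randrange(8)
            if not violates_rules(melody, candidate):
                melody.append(candidate)  # passed all tests: accept
                failures = 0
            else:
                failures += 1
                if failures > max_failures:
                    break  # probably stuck: abandon this attempt and restart
        if len(melody) == length:
            return melody


print(random_sieve(16, seed=1))
```

Resetting the failure count after every accepted note means the threshold measures consecutive failures, matching the "how many times in a row" test in the description.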

For instance, Experiment One and Experiment Two (as movements of the Illiac Suite were
called) were based on the rules of species counterpoint that were formalized by Fux (1725) in his
work Gradus ad Parnassum. Fux's method is still widely taught in counterpoint classes today.

Hiller and Isaacson expressed Fux's rules in numerical terms that could be represented in a
computer program. If a random choice would construct a harmony that violates the encoded rules of
counterpoint, for example, movement by parallel or direct unisons, fourths, fifths, or octaves
(figure 9.34), then their system would discard the choice and try again until no rules were
violated. The successful choices were then appended to the end of the musical composition being
generated, and the process was repeated until the composition was of the desired length. They
conducted numerous tests of this kind at each step of the composing process.





Figure 9.34

Parallel motion: parallel octaves, direct octaves, parallel fifths, direct fifths.

They found that for complex rule sets the situation can sometimes arise where there is no choice
that doesn't break some rule. As a trivial example, suppose one rule establishes a range that the
melody must lie within, but another requires it to skip outside of this range. The program would
never finish running because no solution exists. The approach taken in figure 9.33 allows the
program to restart if the number of unsuccessful trials exceeds some predefined threshold.

9.18.2 Backtracking 

A finer-grained recovery technique is to backtrack if forward movement seems impossible. To do
so, the current choice that appears to be stuck is aborted and the previous choice is repealed as well,
forcing a new choice for the previous state. Then progress is attempted from there. If this still
doesn't work, the next previous choice is repealed, and so forth. Stanley Gill (1963) was apparently
the first to demonstrate the use of backtracking for composition in a 1963 piece composed for the
BBC in the style of Arnold Schoenberg.
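A minimal Python sketch of backtracking follows. It is not Gill's program: the rules (no repeated note, no leap wider than four scale steps, final note must be the tonic) are hypothetical, chosen only because the last rule creates dead ends that force earlier choices to be repealed. The names `allowed` and `compose` are likewise assumptions.

```python
import random


def allowed(melody, candidate, length):
    """Toy rules over scale degrees 0..7 (hypothetical, for illustration)."""
    if melody:
        leap = abs(candidate - melody[-1])
        if leap == 0 or leap > 4:
            return False
    if len(melody) == length - 1 and candidate != 0:
        return False  # the final note must be the tonic
    return True


def compose(length, seed=None):
    """Depth-first search: try candidates in random order, backtrack when stuck."""
    rng = random.Random(seed)
    melody = []
    # untried[i] lists the candidates not yet attempted at position i.
    untried = [rng.sample(range(8), 8)]
    while len(melody) < length:
        options = untried[-1]
        while options:
            candidate = options.pop()
            if allowed(melody, candidate, length):
                melody.append(candidate)
                untried.append(rng.sample(range(8), 8))
                break
        else:
            # Dead end: drop this position and repeal the previous choice.
            untried.pop()
            if not melody:
                return None  # every possibility has been exhausted
            melody.pop()
    return melody


print(compose(12, seed=2))
```

Because each position remembers which candidates it has already tried, a repealed choice is never retried in the same context, so the search either finds a melody or proves that none exists.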

Gill avoided stalemates between conflicting rules by prioritizing them. When evaluating the
suitability of a particular choice, his program calculated a score of demerits based on how many rules that
choice would violate and how important the violated rules were. The choice with the lowest demerits
was accepted unless no choice produced a score low enough, in which case the program would
backtrack. Gill's program extended a small number (eight) of competitive versions of a composition in
progress. At each step, one would be extended by a certain length (one beat), then evaluated for its
goodness. Versions that were unfruitful were eventually abandoned automatically by his method.

Prioritizing the rules allowed Gill to adjust the rate of composition. If the criteria for extending a
sequence were too severe, the program would make no progress; if they were too lenient, it would
rapidly produce a composition of poor quality. He scaled the demerit score at each step by an adjustable
coefficient that allowed him to mediate the rate of composition. The adjustable coefficient was itself
determined by a negative feedback process so that the rate of composition remained relatively steady.

9.18.3 Searching 

The random sieve method generates music much as one might try to find one's way through a maze:
the rules are like the walls of the maze, and the random number generator is how one chooses a
new direction to try. Backtracking is a strategy for recovering from dead ends. In any event, what
these methods are doing is searching for solutions, that is, looking for a way through the maze.

In general, two search strategies can be used, depending on the purpose (Ames 1983). 





Comparative Search. We may be in the maze purposely in order to map it. In that case, we want
to systematically enumerate every possible solution from every possible entry point to every
possible exit. We can then compare the goodness (in Knuth's sense of the term) of all possible solutions
and arrive at the optimal one. In this case, we'd use a deterministic method of choosing a new
direction at each step to be sure we traversed the entire maze from every possible direction of
attack. We'd use backtracking to get out of dead ends.

Constrained Search. We might simply want to exit the maze as quickly as possible without
having to compare all possible solutions. This is a good approach if the time to find a solution
is limited, if any solution will do, or if we believe good-enough solutions are plentiful. We may
be forced to use this approach if the maze is so extensive that comparative search is not feasible.
Composing music and playing chess can be thought of as very extensive mazes indeed, so this
technique is often used in these cases. We'd use a random method of choosing a new direction
at each step and employ backtracking to get out of dead ends. Gill's method of scaling the
demerits of each choice is rather like adjusting the height of the barriers of the maze, allowing us to
jump over low hurdles to speed progress (possibly to the detriment of the quality of the solution).

Bach Chorale Harmonization with Constrained Search. One of the gold standards for
modeling composition with computers is to replicate or create new works in the style of J. S. Bach's
389 chorale harmonizations.¹⁴ The chorales were originally simple unaccompanied melodies that
Bach arranged in a homophonic chordal style to be sung by church choirs. Because the style is so
definite and regular, and because virtually every composition student is required to study them,
these chorales have become a kind of standardized "laboratory rat" for such tests.

Kemal Ebcioglu (1986; 1988) used constrained search with prioritized rules and backtracking
to model composing two-part species counterpoint, and he later used these techniques in his
impressive program for harmonizing the chorale melodies of J. S. Bach. He programmed a
computer with general rules about harmonic part writing based on the theories of Schenker (1935) and
added specific information about Bach's chorales using a logic programming language. He created
new chorale harmonizations that emulated Bach's style very closely. In fact, some of Bach's
chorale harmonizations emerged verbatim from his system.

9.19 Markov Chains

Even if we were able to identify all the rules that characterize a particular musical style (and that's
a big "if"), there is still a great deal of difference between music that breaks no rules and music
that shows taste. Certainly a critical element of a composer's aural sensibility is a sensitivity to
musical context, but none of the methods discussed so far take the surrounding music into account
to determine subsequent choices.

Markov chain techniques are sensitive to their immediately preceding context, so they
can create contextually appropriate outcomes. Markov chains use recently chosen states to
influence the probability of subsequent choices. Another advantage of Markov chains is that the
rules driving the process can be readily discovered from existing compositions. Thus, it is possible
to use Markov chains to compose music that is like other music. Harry Olson (1952) used them to
construct musical examples that resembled the works of the composer Stephen Foster, and Hiller and
Isaacson (1959) used them to compose a movement of the Illiac Suite. The technique is widely used.

Figure 9.35

Chorus from Oh Susanna by Stephen Foster. (Lyrics: "Oh Susanna, oh don't you cry for me, For I come from Alabama with a banjo on my knee.")

9.19.1 Markov Chain Orders 

Markov chains are ordered by how much recent history is taken into account when determining
the next state. Following Olson's lead, let's analyze a Stephen Foster song, Oh Susanna, using
various orders of Markov process. By focusing just on the chorus of the tune, we can keep the analysis
from becoming too long-winded. Figure 9.35 shows the chorus, which has 25 notes (not counting
rests), labeled $R_0$ to $R_{24}$.

9.19.2 Zeroth-Order Markov Process

Since the weighted choice technique (see section 9.14.4) takes no account of any previous states,
it is defined as the zeroth-order Markov process, $H_0$. Even simple weighted choice is useful for
matching the static event frequency of data drawn from the real world.

We create the probability density function for Oh Susanna by counting how many times each 
pitch is visited as a ratio of the total number of notes: 

C      D      E      F      G      A      B
4/25   5/25   5/25   2/25   5/25   4/25   0

The counts are expressed as a fraction of the total number of notes. A table like this of event
occurrences is called a histogram.

Feeding the Oh Susanna probability density function into the weighted choice technique would
generate a new melody with pitches in roughly the same proportions as Oh Susanna, but the new
melody would probably have little if any of the musical character of the original.
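The histogram and the weighted choice can be tried out with a short Python sketch. The pitch sequence in `CHORUS` is a transcription assumed here for illustration; it is not printed in the text, although it reproduces the counts in the table above. The function names are also hypothetical, and `random.Random.choices` from the standard library stands in for the weighted choice technique.

```python
import random
from collections import Counter

# Assumed pitch-letter transcription of the 25-note chorus (see the lead-in).
CHORUS = "F F A A A G G E C D C D E G G A G E C D E E D D C".split()
PITCHES = list("CDEFGAB")


def histogram(notes):
    """Relative frequency of each diatonic pitch in the melody."""
    counts = Counter(notes)
    return {pitch: counts[pitch] / len(notes) for pitch in PITCHES}


def zeroth_order_melody(pdf, length, seed=None):
    """Draw each note independently, weighted by the histogram."""
    rng = random.Random(seed)
    weights = [pdf[pitch] for pitch in PITCHES]
    return rng.choices(PITCHES, weights=weights, k=length)


pdf = histogram(CHORUS)
print(pdf)
print(" ".join(zeroth_order_melody(pdf, 25, seed=4)))
```

Since every note is drawn without reference to its neighbors, the output tends to match the proportions of the original while losing its contour.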

9.19.3 First-Order Markov Process

Since music unfolds in time, the context of each note consists of the note or notes that precede it.
If we want to incorporate context into our analysis, we must study how notes succeed each other
in the melody. For each note, let's tabulate the note that follows it. We can distill from this
information what the probability of the next note will be, given the current note.





Markov Analysis. We create a first-order Markov analysis by the following steps:

1. Catalog the note transitions. We pair each note in the melody with the note that follows it.
If we let the first note (F) be the current note, then the second note (also F) is the next note. So
the first transition in the melody is F -> F. If we now make note 2 (F) be the current note, then
the next note is note 3 (A). So the second transition is F -> A. The third transition is A -> A, and
so on.

The transition table (table 9.8) tabulates this information. Each cell stands for a transition from
a particular current note to a particular next note. The row indexes the current note, and the column
indexes the next note. Thus, the first transition, F -> F, is indicated by a 1 in row F, column F. The
second transition, F -> A, is indicated by a 2 in row F, column A. The third transition, A -> A, is
indicated by a 3 in row A, column A, and so forth.

2. Tally up the number of transitions in each cell (table 9.9). What we end up with is essentially
a set of zeroth-order Markov histograms in the rows. When we go to generate a melody based
on this analysis, we select a particular histogram row depending upon which note is the current
note.

Table 9.8

Markov Order 1 Transitions for Oh Susanna (rows: current note; columns: next note)

Table 9.9

Markov Order 1 Tallies for Oh Susanna (rows: current note; columns: next note)

3. Convert the rows into cumulative distribution functions. First we normalize each row. We want
to adjust each histogram so that the sum of its probabilities equals 1. (If any row sums to 0, we set
all elements of that row to 0.) This is shown in table 9.10.

4. Transform each row into a cumulative distribution function by summing each cell with all
cells in the row to its right (table 9.11). The table is finally in a cumulative distribution format we
can use to synthesize a first-order Markov melody. It determines subsequent notes based on how
probable the transition is in the original melody. The method of traversing this function is the same
as that described in section 9.14.6.
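The four steps above can be sketched in Python as three small functions. Two assumptions are worth flagging: the pitch sequence in `CHORUS` is a transcription supplied here for illustration (it agrees with the histogram in section 9.19.2 but is not printed in the text), and the running sum in `cumulative` accumulates from left to right across each row, which is one common convention and may differ from the direction used in table 9.11. All function names are hypothetical.

```python
PITCHES = list("CDEFGAB")

# Assumed pitch-letter transcription of the 25-note chorus (see the lead-in).
CHORUS = "F F A A A G G E C D C D E G G A G E C D E E D D C".split()


def tally(notes):
    """Steps 1 and 2: count how often each current note leads to each next note."""
    table = {cur: {nxt: 0 for nxt in PITCHES} for cur in PITCHES}
    for cur, nxt in zip(notes, notes[1:]):
        table[cur][nxt] += 1
    return table


def normalize(table):
    """Step 3: scale each row to sum to 1 (rows with no transitions stay all 0)."""
    out = {}
    for cur, row in table.items():
        total = sum(row.values())
        out[cur] = {nxt: (count / total if total else 0.0) for nxt, count in row.items()}
    return out


def cumulative(table):
    """Step 4: replace each cell with the running sum along its row."""
    out = {}
    for cur, row in table.items():
        running = 0.0
        out[cur] = {}
        for nxt in PITCHES:
            running += row[nxt]
            out[cur][nxt] = running
    return out


counts = tally(CHORUS)
probs = normalize(counts)
cdf = cumulative(probs)
for cur in PITCHES:
    print(cur, [counts[cur][nxt] for nxt in PITCHES])
```

Printing `counts` row by row gives a tally table with the current note down the side and the next note across the top, the same orientation the text describes.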

Table 9.10

Normalized Markov Order 1 for Oh Susanna

Table 9.11

Markov Order 1 Distribution Function for Oh Susanna

Markov Synthesis. When using table 9.11 to generate a melody, we pick a starting note at
random from the sample space, {C, D, E, F, G, A} (pitch B is ignored because nothing transitions to
or from it). Let's make F the current note. Table 9.11 shows that there is a 50/50 chance that the
next note will be F or A. (This may be easier to follow by reference to table 9.10.) Suppose A is
chosen; it is now the current note. Then there is a 50/50 chance that the next note will be G or A.
Suppose G is chosen; it is now the current note. Now it is twice as likely that E or G will be the
next note than that A will be. We proceed like this until we have enough notes. Figure 9.36 is an
example generated automatically from this data set with starting pitch F. Only the pitches were
synthesized; the rhythms were copied from the original to aid comparison. This method carries a
hint of the musical character of the original into the synthesized melody.
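The synthesis walk can be sketched in Python as follows. As before, `CHORUS` is an assumed transcription of the chorus pitches, consistent with the probabilities quoted above but not printed in the text. The sketch draws directly from the tallies with `random.Random.choices` instead of traversing a cumulative table, which selects notes with the same probabilities; the function names are hypothetical.

```python
import random

# Assumed pitch-letter transcription of the 25-note chorus (see the lead-in).
CHORUS = "F F A A A G G E C D C D E G G A G E C D E E D D C".split()


def transition_counts(notes):
    """Map each current note to a dict of {next note: number of occurrences}."""
    table = {}
    for cur, nxt in zip(notes, notes[1:]):
        row = table.setdefault(cur, {})
        row[nxt] = row.get(nxt, 0) + 1
    return table


def synthesize(table, start, length, rng):
    """Random walk: each next note is drawn from the row of the current note."""
    melody = [start]
    while len(melody) < length:
        row = table[melody[-1]]
        nxt = rng.choices(list(row), weights=list(row.values()))[0]
        melody.append(nxt)
    return melody


table = transition_counts(CHORUS)
print(table["G"])  # G is followed by G twice, E twice, and A once
print(" ".join(synthesize(table, "F", 25, random.Random(7))))
```

Every adjacent pair in the output is a transition that occurs somewhere in the source melody, which is where the hint of the original's character comes from.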

A first-order Markov process asks, Given the immediately preceding state $x_{n-1}$, what is the
likelihood that the current state R is $x_n$? This is written using conditional probability notation,

$$q = P\{R = x_n \mid x_{n-1}\},$$

which is read as "Given the condition that $x_{n-1}$ is the preceding state, let q be the probability that
state R equals $x_n$."

Directed Graph. Another way to represent the first-order Markov transition information we have
developed is to show it as a directed graph, which illustrates the flow of possibilities from state to
state. States are represented by circles, and transitions from state to state are represented as arcs
(lines with arrows). The directed graph of the chorus for Oh Susanna is shown in figure 9.37.

Figure 9.36

Oh Susanna chorus synthesized by first-order Markov process.

Figure 9.37

Directed graph of Oh Susanna, first-order Markov analysis.





The diatonic pitches of the scale are shown in circles. The arcs are labeled with their transition
probabilities.

When synthesizing a melody, notice that once we leave pitch F, we can never return to it,
because no pitch besides F ever transitions to F. Pitch B is unreachable. Markov synthesis is free
to cycle among the remaining pitches. Because it contains cycles, it is a directed cyclic graph
(DCG). If there were no cycles in the graph, it would be a directed acyclic graph (DAG).

9.19.4 Second-Order Markov Process

Second-order Markov analysis basically asks, Given two events in sequence, what is the probability
of the next event? We express the probability as

$$q = P\{R = x_n \mid x_{n-1}, x_{n-2}\},$$

which is read as "Let q be the probability that R equals $x_n$, given that $x_{n-1}$ and $x_{n-2}$ precede it in
sequence." We could represent the first few second-order transitions for Oh Susanna like this:

F:F -> A, F:A -> A, A:A -> A, A:A -> G, A:G -> G, G:G -> E, ...

How many possible second-order transitions are there for the diatonic scale? First-order Markov
analysis involves two notes (current and next) and so has $7^2 = 49$ orderings. Second-order Markov
analysis involves three notes (previous, current, and next), and by the rule of enumeration, there
are $7^3 = 343$ possible orderings. We still want to represent the transitions as a two-dimensional
matrix so that, as before, each row represents a zeroth-order Markov density function that determines
the probability of the next note. We can manage this by marking the rows as the pair of previous
and current pitches, and the columns as the next pitch. For the diatonic scale, this requires
49 rows and 7 columns, still a pretty big table, but to save room we can leave out any rows that have
no transitions.

The analysis is shown in table 9.12. To conserve space, the transition event order and the normalized
probability distributions are shown in the same table. For example, the listing for the first
transition, F:F -> A, reads 2 (1.00), which means the target pitch A is note number 2 in the melody
(counting from 0), and the probability of this transition is 1.00. Sometimes more than one note
shares the same transition. For example, E:C -> D is shared by notes 9 and 19.
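
The construction of such a table can be sketched as follows. This is an illustrative sketch under the assumption that the chorus pitches are those implied by the event numbers in table 9.12; `second_order_table` is a hypothetical helper, and only rows that actually occur are stored.

```python
from collections import defaultdict

MELODY = list("FFAAAGGECDCDEGGAGECDEEDDC")  # pitches as implied by table 9.12

def second_order_table(notes):
    """Rows keyed by (previous, current); each cell holds the event numbers
    of the target notes and the normalized probability of that transition."""
    events = defaultdict(lambda: defaultdict(list))
    for n in range(2, len(notes)):
        events[(notes[n - 2], notes[n - 1])][notes[n]].append(n)
    table = {}
    for pair, row in events.items():
        total = sum(len(ns) for ns in row.values())
        table[pair] = {nxt: (ns, len(ns) / total) for nxt, ns in row.items()}
    return table

table = second_order_table(MELODY)
print(7 ** 2, 7 ** 3)        # possible first- and second-order orderings
print(len(table))            # rows that actually occur in the melody
print(table[("F", "F")])     # the listing for F:F -> A
print(table[("E", "C")])     # E:C -> D is shared by two notes
```

Storing the rows in a dictionary keyed by the pitch pair is one way to leave out the rows that have no transitions.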

Figure 9.38 shows an example second-order melody synthesized from table 9.12. The melody
length and rhythms are the same as the original to facilitate comparison, although they could also
be synthesized from a Markov analysis. Note the direct quotation of the original in the first
six notes. Because it takes more of the preceding music into account when choosing the next note,
melodies created from higher-order Markov synthesis carry over more of the exact phrasing of the
original melody.

If we start the Markov synthesis on other than the F:F transition, we enter the analysis matrix
at a different position, and different patterns are synthesized. Table 9.13 shows a few example note
sequences generated by entering table 9.12 at different initial transitions.





Table 9.12

Second-Order Markov Analysis of Oh Susanna

Current   Next
          C             D             E              F    G           A           B
D:C       0             11 (1.00)     0              0    0           0           0
E:C       0             9, 19 (1.00)  0              0    0           0           0
C:D       10 (0.33)     0             12, 20 (0.67)  0    0           0           0
D:D       24 (1.00)     0             0              0    0           0           0
E:D       0             23 (1.00)     0              0    0           0           0
D:E       0             0             21 (0.50)      0    13 (0.50)   0           0
E:E       0             22 (1.00)     0              0    0           0           0
G:E       8, 18 (1.00)  0             0              0    0           0           0
F:F       0             0             0              0    0           2 (1.00)    0
E:G       0             0             0              0    14 (1.00)   0           0
G:G       0             0             7 (0.50)       0    0           15 (0.50)   0
A:G       0             0             17 (0.50)      0    6 (0.50)    0           0
F:A       0             0             0              0    0           3 (1.00)    0
G:A       0             0             0              0    16 (1.00)   0           0
A:A       0             0             0              0    5 (0.50)    4 (0.50)    0


Table 9.13

Other Second-Order Markov Note Sequences from Oh Susanna

D:C   D C D E G G E C D E G G A G G E C D C D E E D D C
E:C   E C D E E D D C D C D C D E G G E C D E G G E C D
C:D   C D C D C D E G G E C D C D E E D D C D E G G A G
D:D   D D C D C D E G G E C D C D E E D D C D E G G A G
E:D   E D D C D C D E E D D C D E E D D C D E E D D C D
D:E   D E E D D C D E E D D C D E G G E C D E G G E C D

Figure 9.38

Oh Susanna chorus synthesized by second-order Markov process.





9.19.5 Third-Order Markov Process

Third-order Markov transitions require three notes of context. The first few transitions are
F:F:A -> A, F:A:A -> A, A:A:A -> G, A:A:G -> G.

The analysis is shown in table 9.14. As in table 9.12, the transition event order and the normalized
probability distributions are shown in the same table to conserve space.

Figure 9.39 shows an example third-order melody synthesized from table 9.14. Again, the melody
length and rhythms are the same as the original to facilitate comparison, although they could


Figure 9.39

Oh Susanna chorus synthesized by third-order Markov process.


Table 9.14

Third-Order Markov Analysis of Oh Susanna

Current    Next
           C           D             E           F    G           A           B
C:D:C      0           11 (1.00)     0           0    0           0           0
G:E:C      0           9, 19 (1.00)  0           0    0           0           0
D:C:D      0           0             12 (1.00)   0    0           0           0
E:C:D      10 (0.50)   0             20 (0.50)   0    0           0           0
E:D:D      24 (1.00)   0             0           0    0           0           0
E:E:D      0           23 (1.00)     0           0    0           0           0
C:D:E      0           0             21 (0.50)   0    13 (0.50)   0           0
D:E:E      0           22 (1.00)     0           0    0           0           0
G:G:E      8 (1.00)    0             0           0    0           0           0
A:G:E      18 (1.00)   0             0           0    0           0           0
D:E:G      0           0             0           0    14 (1.00)   0           0
E:G:G      0           0             0           0    0           15 (1.00)   0
A:G:G      0           0             7 (1.00)    0    0           0           0
G:A:G      0           0             17 (1.00)   0    0           0           0
A:A:G      0           0             0           0    6 (1.00)    0           0
F:F:A      0           0             0           0    0           3 (1.00)    0
G:G:A      0           0             0           0    16 (1.00)   0           0
F:A:A      0           0             0           0    0           4 (1.00)    0
A:A:A      0           0             0           0    5 (1.00)    0           0






Figure 9.40 

Degenerate cycle.

also be synthesized from a Markov analysis. The transition probabilities are now so constrained
that a major chunk of the original melody is quoted (motive A in the figure). Only the last measure
(motive B) is different. To see how this came about, note that the Markov synthesis simply repeated
the melodic fragment C. So B is really just part of C, which is part of A. What happened?

Basically, we hit a cycle in our analysis where a state returns back on itself (see section 9.19.3).
Cycles that can't be escaped once they are entered are degenerate. The group {C, D, B} in figure 9.40
is a degenerate cycle. The other states are cyclic but not degenerate. This becomes an increasing
problem with higher-order Markov synthesis.

9.19.6 Nth-Order Markov Process

The general form of an Nth-order Markov process can be expressed as $X = M_N(x)$, where M is a
Markov analysis function of order N, x is the sequence to be analyzed, and X is the set of probability
distribution functions that result from the analysis. Nth-order Markov synthesis can be expressed
as $y = M_N^{-1}(X)$, where y is the melody synthesized by the process.
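
The analysis and synthesis functions can be sketched generically for any order of at least 1. The code below is a hypothetical illustration, assuming the chorus pitches implied by tables 9.12 and 9.14; the names `analyze` and `synthesize` stand in for $M_N$ and $M_N^{-1}$ and are not taken from the text.

```python
import random
from collections import Counter, defaultdict

def analyze(x, order):
    """X = M_N(x): map each context of `order` notes (order >= 1) to a
    distribution over the note that follows it."""
    counts = defaultdict(Counter)
    for n in range(order, len(x)):
        counts[tuple(x[n - order:n])][x[n]] += 1
    return {
        ctx: {note: k / sum(c.values()) for note, k in c.items()}
        for ctx, c in counts.items()
    }

def synthesize(X, seed, length, rng=random):
    """y = M_N^{-1}(X): extend `seed` by sampling from X until `length`
    notes exist or the current context has no row in X."""
    y = list(seed)
    order = len(seed)
    while len(y) < length:
        row = X.get(tuple(y[-order:]))
        if row is None:
            break
        y.append(rng.choices(list(row), weights=list(row.values()))[0])
    return y

x = list("FFAAAGGECDCDEGGAGECDEEDDC")  # pitches as implied by tables 9.12 and 9.14
for order in (1, 2, 3):
    X = analyze(x, order)
    y = synthesize(X, x[:order], len(x))
    print(order, len(X), "".join(y))
```

The seed supplies the first N notes, so its length fixes the order used during synthesis. Synthesis can stop short when it reaches a context, such as the final notes of the melody, that was never followed by anything in the analyzed sequence.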

As the order increases, we're more likely to get significant chunks of the original in the synthesized
melody. At a sufficiently high order, depending upon the material, we will get the entire
original. This happens for our example melody with fourth-order Markov synthesis. Thus,
although arbitrary-order Markov processing is theoretically possible, for most realistic applications,
analysis beyond about the fourth order may not be particularly meaningful.

9.20 Causality and Composition

On hearing Lejaren Hiller's Illiac Suite, John Pierce (1983) reported that it "sounds pleasant, but
it wanders, and so does the listener's attention." Markov techniques are strictly reactive to the
immediately preceding events and do not lend themselves to following an overall plan.






It is worth pausing for a moment to look at our assumptions about the role of causality in music.
A system is causal if it references only current and past input and past output. Causal systems may
not reference future input or current or future output. That is to say, a causal system can't know
the future. (These ideas are formalized in the discussion of the canonical filter in volume 2, chapter 5.)
Certainly, listening to music we've not heard before is a causal process: we can't know what we'll
hear until we hear it, and we can't know our reaction until we have it.

It is easy to assume that because listening is causal, composing must somehow be, too. Some
forms of composition, such as improvisation, are primarily causal, and many of the techniques
discussed in previous sections, especially Markov chains, give the impression that composing
starts with the first note and proceeds to the last in a direct sequence. This is hardly ever the case
in practice.

If a role of the composer is to manipulate expectation, then the composer must be of two minds,
one part imagining what the listener's expectations will be in time, and the other part keeping a
"timeless" plan of the composition in mind. We may think of such a plan as a static design, like
an architectural drawing of a building. But any such plan is itself the result of the composer's
pursuing an underlying goal: the aim, motive, or reason why the composer is writing the music. The
act of reducing one's vision of a composition to a finished score is a teleological process, a process
that works backward from the composer's goal.

Composer Herbert Bielawa and Paul Craner developed a teleological process to automatically
compose chorale harmonizations.15 Early in his harmony theory teaching career, Bielawa had
students tally the types of chordal root movements in the Euro-classic music they were studying. No
matter what the piece was, as long as it was Euro-classic, they would come up with very similar
graphs. He eventually boiled it down to a rule: good progressions are up a second, down a third,
and either way a fourth or fifth (dominant to tonic).16 The method Bielawa developed to embody
this rule was in essence a simple first-order Markov process, but with a twist. To overcome the
aimlessness of Markov chains and solve some tricky problems with cadencing, his program composed
backward, beginning with the final cadence.

Ordinarily, one would want "good" root movements (up a fourth or down a fifth) to be selected
most often and down a third less often. And although "bad" root movements were rare, they did
happen occasionally in real music, so these also had small but nonzero probability. However, to
compose the music backward, Bielawa had to flip the probabilities so that all the "good" root
movements had to be temporarily "bad" ones, and vice versa. Ultimately, retrograding the
generated composition automatically made good progressions out of the bad ones.
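
The idea of composing backward and then retrograding can be illustrated with a toy sketch. It is not a reconstruction of Bielawa and Craner's program, whose probabilities are not given here. It assumes chord roots numbered 0 to 6 on the scale degrees, treats all "good" movements as equally likely, omits "bad" movements and cadence handling, and replaces the flipped probability table with the equivalent step of subtracting each chosen movement. All names are hypothetical.

```python
import random

# Scale degrees 0..6 stand for chord roots. The "good" movements named in the
# text, expressed as steps mod 7: up a second (+1), down a third (-2), and a
# fourth or fifth in either direction (+3, -3, +4, -4).
GOOD_STEPS = sorted({s % 7 for s in (1, -2, 3, -3, 4, -4)})

def compose_backward(length, final_root=0, rng=random):
    """Build the progression from the final chord toward the first, then
    retrograde it so that it reads forward."""
    backward = [final_root]
    while len(backward) < length:
        step = rng.choice(GOOD_STEPS)               # movement we want to hear
        backward.append((backward[-1] - step) % 7)  # so step back by its inverse
    return backward[::-1]

roots = compose_backward(8)
print(roots)   # ends on the chosen final root; every forward movement is "good"
```

Because each backward step undoes a good movement, reading the result forward yields only good movements and is guaranteed to arrive at the chosen final root.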

9.21 Learning 

Hiller and Isaacson's experiments were in the spirit of research efforts to embody expert
knowledge about real-world problems in computer programs, known as Artificial Intelligence (AI).

The classical AI approach attempts to reduce the subject knowledge domain to its essential rules,
much as Fux did for counterpoint. Based on what we've seen of Monte Carlo techniques, at least
the following difficulties can be identified with using rule-building systems to model intelligence:

■ Determining appropriate rules can be difficult (or impossible), even for experts, if the knowledge
is not available to consciousness. Rule-building AI techniques are difficult to apply to subjective
elements such as taste, preference, and style.

■ Unlike people, rule systems cannot of themselves adapt to new information or incorporate new
rules. That is, they cannot learn of their own accord but require the introspection and programming
skill of trained experts.

■ The more rules there are, the harder it can be to add new rules without breaking or distorting the
logical structures already encoded. As we saw with the random sieve method (section 9.18.1), it is easy
to introduce rules that contradict each other. The system becomes more fragile as rules are added.

■ True expertise means knowing the rules of a discipline as well as the exceptions that prove the
rules. So there must be rules about when the regular rules apply and when they don't. This calls
for metarules that enable or disable other rules in certain contexts. This leads to hierarchies of rule
systems. As a system of rules grows more complex, it becomes progressively harder for it to
change or adapt to new information and novel circumstances. It becomes more brittle as the depth
of hierarchy increases.

Capturing real-life expertise by compiling lists of rules tends to create rule systems that are
brittle and fragile. In contrast, human knowledge remains relatively flexible in the face of novel
insights and developments. It does not seem likely that human learning happens by piling up lists
of rules. If that were true, then the more we know, the longer it would take us to react to
circumstances, assuming some finite time to evaluate each rule. Rule-based classical AI is not a very
probable model of human cognition.

Whatever musical knowledge is, it certainly seems to arise from experience and is thus learned.
We appear to learn music by using cognitive strategies that are built into our brains. We apply these
cognitive strategies to our experience of music, and somehow the result is knowledge of music.
From this knowledge arises affinity for certain forms of music, and musical taste arises. What are
these cognitive strategies? What is learning, and how can we model it?

9.21.1 A Self-Learning Grammar

Teuvo Kohonen (1989) has described "a self-learning grammar, the rules for which are
automatically and systematically constructed on the basis of exemplary material." The method, which he
calls dynamically expanding context (DEC), is like Markov analysis, but instead of fixed-order
analysis it uses an order of analysis that grows automatically as necessary to resolve conflicts in
the rules. Thus general rules are gradually replaced with specific ones, mastering the maximal
degree of complexity with the minimal amount of exemplary materials. This exhibits a form of
learning because the rules evolve with increased experience.

DEC is a form of unsupervised learning because no a priori knowledge of music is embedded
in the DEC method. However, to fully exploit the method, it is necessary to carefully formulate the
exemplary material. Like the Markov process, DEC can be driven to synthesize compositions.





The method is best illustrated by considering an example that Kohonen provides. Consider a
melody as a sequence of musical elements to which letters have been assigned:

ABCDEFG . . . IKFH . . . LEFJ ....

As with first-order Markov analysis, we start by examining the transitions. We eventually notice
that there is a three-way conflict for which symbol may follow F: it may be G, H, or J. Using
Markov techniques, we would assign probabilities to the outcomes based on their frequency. But
Kohonen's approach is to resolve this conflict by enlarging the context. We take the symbol in
front of F for additional context (like dynamically jumping to a second-order Markov analysis for
just this rule). But there is still a two-way conflict because the successor to E:F could be G or J.
Adding a second symbol before F fully disambiguates the three cases. While third-order analysis
is required for F, it is overkill for other symbols, such as H, which is fully defined by the
second-order rule K:F -> H, and for C, which is fully defined by a first-order rule B -> C. We
wish to avoid overspecifying the production rules because, as we saw with higher-order Markov
processes, too much context means the rules are too specific and rigid. DEC thus dynamically
expands rules only to the extent required to resolve conflicts.

DEC Analysis  Kohonen's method is to iteratively scan the training data starting with low-order
rules and apply progressively higher-order rules to problem cases until all conflicts are resolved.
When a conflict is observed, the existing rules are marked invalid, and new rules are substituted
that contain more context. Iteration over the input continues until no further changes to the rules
are necessary.

For example, consider rule construction for F. Its first appearance is F -> G. Because it is not
already in memory, we create an entry for it as follows:

Rule no.   Left part   Right part   Valid
1          F           G            true

Next we find F -> H in the input and observe that it conflicts with rule 1. This requires two
actions: first invalidate rule 1, then (because it is not already in memory) insert a second-order rule
for K:F -> H. Memory now looks like this:

1          F           G            false
2          K:F         H            true

Finally, we find F -> J in the input, and observing its conflict with first-order rule 1, we enter it
as a second-order rule:

3          E:F         J            true

Having exhausted the input, we iterate again from the beginning. When we come to F, we search
memory and discover the invalid rule F -> G. We now expand its context by one order, creating
a new rule E:F -> G. However, we now observe conflict between E:F -> G and E:F -> J.
We must invalidate rule 3 and enter a new third-order rule 4, as follows:

3          E:F         J            false
4          D:E:F       G            true

As we continue to scan the input for F, we'll eventually discover the invalid rule 3, which we
evaluate at a higher order and enter as a new rule 5:

5          L:E:F       J            true

Further iterations over the input do not cause any changes to the rules, so we are done, and we
observe that rules 2, 4, and 5 remain valid.
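
The end result of this analysis can be reproduced with a much simpler batch computation. The sketch below is a shortcut, not Kohonen's iterative invalidate-and-expand procedure: for each symbol it looks for the shortest preceding context that has only one possible successor in the training data. It assumes the three visible fragments of the example are treated separately (the elided stretches are left out), and `dec_rules` is a hypothetical name.

```python
from collections import defaultdict

def dec_rules(fragments):
    """For every symbol, find the shortest preceding context that has a
    single possible successor anywhere in the training fragments."""
    successors = defaultdict(set)
    for frag in fragments:
        for i in range(1, len(frag)):
            for k in range(1, i + 1):
                successors[frag[i - k:i]].add(frag[i])
    rules = {}
    for frag in fragments:
        for i in range(1, len(frag)):
            for k in range(1, i + 1):
                context = frag[i - k:i]
                if len(successors[context]) == 1:
                    rules[context] = frag[i]
                    break
    return rules

# The three fragments of the example; the elided stretches are left out.
rules = dec_rules(["ABCDEFG", "IKFH", "LEFJ"])
for context, successor in sorted(rules.items(), key=lambda r: (len(r[0]), r[0])):
    print(":".join(context), "->", successor)
```

For the symbol F this yields the same three valid rules as the walkthrough (K:F -> H, D:E:F -> G, and L:E:F -> J), while symbols with no conflict, such as C, keep their first-order rules.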

DEC Synthesis  Suppose so far we have generated the sequence CDEF. To extend the sequence
with a valid next symbol, we first search through memory for first-order rules F -> ?. We find
rule 1, F -> G, which is invalid. Finding no other valid first-order rules, we try second-order rules
E:F -> ? and find rule 3, E:F -> J, which is also invalid. Having exhausted second-order
rules, we look for third-order rules D:E:F -> ? and finally find rule 4, D:E:F -> G, which
is valid, so the next new symbol we generate is G.
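
This lookup can be written as a short search over the rule memory. The following is a minimal sketch, assuming memory holds the five rules of the analysis example as (left part, right part, valid) triples; `MEMORY` and `next_symbol` are hypothetical names.

```python
# Rule memory as it stands at the end of the analysis example:
# (left part, right part, valid).
MEMORY = [
    ("F", "G", False),
    ("KF", "H", True),
    ("EF", "J", False),
    ("DEF", "G", True),
    ("LEF", "J", True),
]

def next_symbol(sequence, memory=MEMORY):
    """Try contexts of increasing order; return the right part of the first
    valid rule whose left part matches the end of `sequence`."""
    longest = max(len(left) for left, _, _ in memory)
    for order in range(1, min(longest, len(sequence)) + 1):
        context = sequence[-order:]
        for left, right, valid in memory:
            if valid and left == context:
                return right
    return None  # no valid rule applies

print(next_symbol("CDEF"))   # G
```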

Like Markov synthesis, the output is made up of subsequences of the original material, so that
the flavor of the original is preserved, but not its ordering. DEC synthesis, like Markov synthesis,
contains a random element, but unlike in Markov synthesis, the occurrence of successive notes
does not follow their probabilities in the input. DEC synthesis proceeds as though we always used
the highest-order Markov analysis available for each rule. Kohonen suggests that if the results
generated this way are too normative, more variance in the productions can be achieved by using
lower-order rules, ignoring their validity.

9.21.2 The Nature of Learning

One could say that Markov techniques and Kohonen's DEC technique "learn" to recognize the
features of the materials they are given. But they are unable to generalize from what they know to what
they do not. If we study a corpus of music, say, the fugues of J. S. Bach, we not only learn the
individual works, but as an automatic by-product our cognitive apparatus also distills out a sense of what
a fugue is, so that if we later hear a fugue by Mozart, we recognize its form; we don't have to be
retrained. The Markov and DEC techniques, like all methods considered so far, fail to generalize at all.

There are other important characteristics of natural learning that are also missing, such as
pattern completion, for example, our ability to identify major or minor harmonies from a fragment
of melody. If I show you a letter that is partially occluded, you are still able to recognize it
(figure 9.41a). If it is too occluded to narrow it down to one letter, you can still easily identify the
possibilities (figure 9.41b). In my college music appreciation classes, the professor would often test our
knowledge of the musical repertoire we were studying by playing a randomly selected excerpt of
music by "dropping the needle" (a phrase referring to the days of vinyl records). Even if we had
listened to a piece only a few times or the excerpt lasted no more than a second or two, we would
instantly be able to identify it. It's an incredible skill our brains have, if you think about it.

Figure 9.41

Occlusion.

Another difference between natural cognition and the kinds of machine cognition described so
far is that people can apply multiple simultaneous constraints, but standard computers act
sequentially. When improvising music, a multitude of constraints operate simultaneously, guiding the
musician's choices in the moment.

Our ability to handle multiple simultaneous constraints allows us to mediate the influence of
syntax on semantics, and vice versa (McClelland, Rumelhart, and Hinton 1986). Consider the
sentence "I saw the Grand Canyon flying to New York." We see that syntax constrains the assignment
of meaning but does not determine it. We understand through the interplay of multiple sources
of knowledge. Such structures of knowledge have been variously called frames (Minsky 1974),
schemata (Bobrow and Norman 1975), and scripts (Schank and Abelson 1976). But rather than
being static objects in memory, scripts appear to interact with each other to capture meaning in
novel situations. How do we do this? And can machines do it, too?

9.22 Music and Connectionism

We have all learned skills, such as playing a musical instrument, juggling, or riding a bike, that
we can do without understanding how we do them. We usually learn and teach these skills by
example, not by rule. How do we learn to improvise music? How do we develop a personal musical
style? How do we learn to distinguish the characteristic musical swagger of Beethoven's music
from Schubert's? We know what we know, but we don't necessarily know how we know it. Since
we are clearly able to learn these things, an obvious place to look for solutions is the brain.

9.22.1 Neural Models of Cognition

Neurobiology has shown that the brain can be modeled as a massively interconnected set of neurons
operating in parallel. Cognitive psychologists and computer scientists have studied the properties of
brain models using artificial neural networks. These models store knowledge in the connection
strengths between simple processing units, much as our brains store knowledge in the connections
between neurons. Because many neurons are acting concurrently in parallel, these models are called
parallel distributed processing (PDP) or connectionist models of cognition. Because the connection
strengths between neurons are quantifiable, knowledge in a network is represented in a quantifiable
way. Whereas knowledge in rule-based systems tends to be brittle, knowledge in networks can
change and adapt as new knowledge is acquired and old knowledge is forgotten.

Figure 9.42

Networks.

Figure 9.43

Simple feed-forward network.

9.22.2 Artificial Neural Networks 

An artificial neural network is simply an interconnected set of simple computational units
(figure 9.42a). Usually, the processing performed by all units is the same. Each unit receives inputs
from other units and produces a single output, which can be connected to one or more other units.
Each connection between units has a unique strength that can be adjusted, so the influence of
the units upon each other can vary. In the network shown in figure 9.42b, the connection strength
between units i and j is called $w_{ij}$, and the connection strength between units i and k is $w_{ik}$. A
strong positive output from i would tend to inhibit k if $w_{ik}$ is negative and to excite k if the weight
is positive. If the weight $w_{ij}$ is zero, then the driving unit i has no influence on the driven unit j.

A simple feed-forward network having three layers is shown in figure 9.43. The output of each
unit is fanned into the input of each unit in the next layer. Each unit in the input layer $x_i$ is connected
through weights $w_{ij}$ to hidden units $h_j$, which connect to the output unit $y_k$ through weights $w_{jk}$.
The hidden layer is so named because its values are not directly observable from outside the
network. (There are always weights on the lines connecting units, but conventionally they are not
explicitly drawn so as to keep down the clutter in the interconnection diagrams.)

The processing performed within the individual units can be as simple as just summing all inputs
to produce the output. More typically, the units will use the sum of their inputs to index a nonlinear
function of some kind. The function result is then output from the unit. We can express the input
to each unit $h_j$ from the input row of the network $x_i$ as follows:

$$h_j = f\left(\sum_{i=1}^{N_i} w_{ij} x_i\right), \qquad (9.27)$$

where $w_{ij}$ is the weight from the ith to the jth unit, $N_i$ is the number of input units, and f is some
nonlinear function (Dolson 1989). Equation (9.27) also describes the connection from the hidden
units to the output.

The nonlinearity of the function in each unit is the key to giving neural networks the ability to
make decisions. Without this feature, the output of a unit would simply be proportional to its input.
A nonlinear function allows quantitative changes in the input to result in qualitative changes in the
output, such as turning a unit on or off. This capability allows neural networks to translate from
subsymbolic activation levels to symbolic knowledge. The two most common choices for
nonlinear functions are the hard-limiting signum function:

$$\operatorname{sgn}(x) = \begin{cases} -1, & x < 0, \\ 0, & x = 0, \\ 1, & x > 0, \end{cases} \qquad (9.28)$$

and the soft-limiting logistic function (figure 9.44):

$$f(x) = \frac{1}{1 + e^{-x}}. \qquad \text{Logistic Function (9.29)}$$

The logistic function played a role in the development of modern neural network theory because
it was a component of the proof of an important neural learning technique, back propagation
(Rumelhart, Hinton, and Williams 1986). In practice, it is just one of many possible "squashing
functions" that map real values into a bounded interval.
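
The two nonlinear functions and the computation performed by a single unit are easy to write down directly. This is a minimal sketch; the helper `unit_output` and its argument names are hypothetical and are introduced only to show a weighted sum being passed through the nonlinearity.

```python
import math

def sgn(x):
    """Hard-limiting signum function, equation (9.28)."""
    return -1 if x < 0 else (1 if x > 0 else 0)

def logistic(x):
    """Soft-limiting logistic function, equation (9.29)."""
    return 1.0 / (1.0 + math.exp(-x))

def unit_output(inputs, weights, f=logistic):
    """Weighted sum of the inputs passed through the nonlinearity f."""
    return f(sum(w * x for w, x in zip(weights, inputs)))

for x in (-4.0, -0.5, 0.0, 0.5, 4.0):
    print(x, sgn(x), round(logistic(x), 3))
```

The signum function jumps abruptly between its three values, whereas the logistic function moves smoothly from near 0 to near 1 and equals 0.5 at x = 0.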

There are many ways of connecting units. If there are only feed-forward connections (figure 9.45a),
there are no loops in the network, and the computation of the output is fairly straightforward. If
there are feedback connections (figure 9.45b), computation of the output can get complicated
because the output of unit A could depend upon unit B, which depends upon C, which in turn
depends upon A, and so on. Feedback networks may be partly interconnected (figure 9.45b) or fully
interconnected (figure 9.45c) so that all outputs go to all inputs.

Figure 9.44

Logistic function.

Figure 9.45

Network topologies.

Feedback networks can go into oscillation unless they are carefully designed. For every
recurrent network, there is a corresponding feed-forward network (Minsky and Papert 1969), so I focus
on feed-forward networks here.

Finally, there is the question of assigning weights to the connections. This is where things get
interesting. It is possible to assign weights directly to configure a network to perform a particular
calculation if we know what the weights should be. More often, we don't know what weights to assign,
but we would like the network to discover them. Some networks allow a supervised learning method
to be employed that automatically adjusts the weights in the network until a training pattern applied
to the network's inputs produces the desired value on the output. These networks can learn to produce
a desired outcome from a pattern that is applied to their inputs. We only have to show such networks
what to do, not how to do it. "Here we have a mechanism whereby we do not actually have to know
how to write the program in order to get the system to do it" (Rumelhart, Hinton, and Williams 1986).

As with musical taste and related subjects, we know what we like without necessarily knowing
why we like it. If we can show a network examples of good and bad taste, then we can train it to
share our taste in music. Once these associations are learned, we can use the network synthetically,
to mimic our aesthetic judgments, for example, as a component in a composing program, or
analytically, to understand the structure of our aesthetic choices by studying the network's solution.

So far, this is not much different than Markov and DEC techniques, which can also mimic. But
what a network can do that Markov and DEC techniques cannot is spontaneously generalize from
experience. If we show a trained network an input pattern that it has not previously encountered,
it will make an educated guess based on the examples it has seen so far. Thus, knowledge in a
suitably trained PDP network can retain a degree of flexibility and adaptability to the unknown.

Pattern completion is another form of generalization that networks can perform. If I play you
a few notes from the middle of a familiar tune, you can generally pick up the tune and sing the rest
of it. Pattern completion is crucial to our experience of music because this is how we perceive
regularity and novelty. We can even model creativity as generalization if we think of it as providing
novel responses to novel conditions.

9.22.3 ComputingTaste, a Neural Evaluator of Intervals 

Asasimpleexample, let’steach anetworkto appreciateourtastein musical intervals. Of course, 
for a simple task like thiswecould simply writeout atable, such as table 3.5, stipulating which 
intervals wefind consonantand dissonant. Butsuppose we know whatwelikewithout knowing 
why. We provide a trainable network with example intervals and provide additional inputthat 
gives approval or disapproval based on our preferences, which the network will learn. 

Once the network is trained, we can inspect its interconnection strengths to deduce what it 
knows about our preferences, thus aiding our ability to capture hard-to-explain knowledge. How- 
ever, the i nterpretati on of trai ned networks i s usual Iy nontrivi al. N etwork analysi s may be strai ght- 
forward for simple networks and simple problems. B ut for more complicated tasks, the network’s 
solution will tend to be distributed throughout the network, carri ed i n the overal I pattern of acti vi ty, 
rather than being localized in any particular unit, so the network as a whole must be analyzed for 
these cases (Rumelhart, Hinton, and Williams 1986). AIso, the network may not necessarily find 
theoptimal solution. 

The following example uses an effective trai ni ng method called back propagador! of error, 
which is avaiIableforfeed-forward networks Iiketheoneshown in figure9.43. Remarkably, even 
fairly trivial-looking networks like that one can learn and retain múltiple independent facts, just 
as humans can. 

We must specify the significance of the network's inputs and output, which range numerically
from 0.0 to 1.0. There are many possibilities to choose from, but the best network designs show
a clear relation between the problem at hand and the network topology, and use no more units than
necessary. For this example, we use 13 input units, one for each degree of the chromatic scale
plus the octave, along with one extra input, used only during training, that indicates whether we
judge the interval consonant or dissonant. When an input unit's activation is 1.0, that degree of the
scale is sounding. The consonance/dissonance judgment can be represented as a single output unit.
Consonance is associated with 1.0, and dissonance with 0.0. For the purposes of this experiment, let's
say that we're not aware that the perfect and imperfect intervals are consonant and the rest are
dissonant (see table 3.5). Instead, we train the network with examples of judgment and "discover"
this.

Having specified the input and output units, we must decide about the hidden units. Three layers
are generally sufficient to compute any function of interest with this method, so although multiple
hidden layers can be used, they are not necessary. At present, there is no straightforward method
to decide how many hidden units to use. It is generally best to choose the smallest number that is
effective, both to simplify the calculations and to make the result as general as possible. So we
choose two hidden units for this example.

Overall, we have 13 input units, two hidden units, and one output unit, for a total of 16 units
connected by 28 weighted lines (13 × 2 = 26 from the inputs to the hidden units, plus 2 from the
hidden units to the output). Next we must train the network.




Figure 9.46 

Back propagation. 

Back Propagation of Error. To start, we set all the weights to random values, rather as one
shuffles a deck of cards before starting a game. If we then apply a pattern corresponding to some
musical interval to the network inputs $x_i$, the output $y_k$ can be computed directly. Each hidden
unit $h_j$ receives the sum of all input units times their respective weights $w_{ij}$. Each hidden unit
uses this weighted sum as an index into the logistic function to produce its respective output.
The output unit $y_k$ similarly receives the sum of all hidden units times their respective weights
$w_{jk}$ (figure 9.46).
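
The forward computation just described can be written compactly. The following is a sketch only: it writes the inputs as $x_i$, the hidden units as $h_j$, the output as $y_k$, and the weights as $w_{ij}$ and $w_{jk}$, and it assumes the logistic function has its standard form and that, as in the worked tables later in this section, no bias terms are used.

```latex
h_j = \sigma\Bigl(\sum_i w_{ij}\, x_i\Bigr), \qquad
y_k = \sigma\Bigl(\sum_j w_{jk}\, h_j\Bigr), \qquad
\sigma(s) = \frac{1}{1 + e^{-s}}
```

Because $\sigma(0) = 0.5$, a positive weighted sum pushes a unit toward 1.0 and a negative sum pushes it toward 0.0.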

Because we randomized the weights, it is highly unlikely that the output of the network will initially
agree with our judgment of an interval's consonance. Let us call the judgment we'd prefer the
network to learn the target, $\tau_k$. The network's error in judgment is the difference between
the network's actual output and the target: $e = y_k - \tau_k$. If the output unit's activation level is less
than the target level, the error can be reduced by connecting the output unit more strongly to hidden
units that are producing a positive value. If the output unit's activation level is greater than the target
level, the output unit needs to be connected more strongly to hidden units that are producing
a negative value. We can adjust the weights connecting the output unit to the hidden units, but
this will not fix the problem by itself because the hidden units themselves also contribute to the
error.

Adjusting the hidden units is more challenging because we don't have explicit target values for
them; only the output units have targets. Nonetheless, the same basic strategy can be employed.
We determine the proportion of error each hidden unit is responsible for and adjust the weights
connecting them to the input units to minimize this error. The general strategy is to propagate the error
backward through the network, adjusting the weights as we go in proportion to their responsibility
for the error in judgment at the output, which is how this learning technique came to be called back
propagation of error.17
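
The strategy above can be sketched in a few lines of Python. This is a minimal illustration, not the author's program: the learning rate, the number of passes, the random seed, the pattern-by-pattern update order, and the absence of bias terms are all assumptions, as is the choice to treat the unison (C alone) as the thirteenth consonant training pattern. The consonance judgment is supplied as the training target.

```python
import math
import random

# Semitone distances above C assumed consonant: unison, thirds, fourth,
# fifth, sixths, and octave (the perfect and imperfect intervals).
CONSONANT = {0, 3, 4, 5, 7, 8, 9, 12}

def logistic(s):
    return 1.0 / (1.0 + math.exp(-s))

def make_diads():
    """Build 13 training patterns: C sounding with each of the 13 degrees."""
    patterns = []
    for semitones in range(13):
        x = [0.0] * 13
        x[0] = 1.0
        x[semitones] = 1.0
        patterns.append((x, 1.0 if semitones in CONSONANT else 0.0))
    return patterns

def forward(x, w_in, w_out):
    """Return the hidden activations and the output activation."""
    h = [logistic(sum(w * xi for w, xi in zip(row, x))) for row in w_in]
    y = logistic(sum(w * hj for w, hj in zip(w_out, h)))
    return h, y

def total_error(patterns, w_in, w_out):
    """Sum of squared output errors over all patterns."""
    return sum((forward(x, w_in, w_out)[1] - t) ** 2 for x, t in patterns)

def train_step(x, target, w_in, w_out, rate):
    h, y = forward(x, w_in, w_out)
    # Error at the output, scaled by the slope of the logistic function.
    delta_out = (y - target) * y * (1.0 - y)
    # Share of that error attributed to each hidden unit (computed before
    # the output weights are changed).
    delta_hid = [delta_out * w_out[j] * h[j] * (1.0 - h[j]) for j in range(len(h))]
    for j in range(len(h)):
        w_out[j] -= rate * delta_out * h[j]
        for i in range(len(x)):
            w_in[j][i] -= rate * delta_hid[j] * x[i]

def train(epochs=2000, rate=0.5, seed=1):
    """Train a 13-2-1 network; return the total error before and after."""
    rng = random.Random(seed)
    w_in = [[rng.uniform(-0.5, 0.5) for _ in range(13)] for _ in range(2)]
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(2)]
    patterns = make_diads()
    before = total_error(patterns, w_in, w_out)
    for _ in range(epochs):
        for x, target in patterns:
            train_step(x, target, w_in, w_out, rate)
    return before, total_error(patterns, w_in, w_out)

before, after = train()
print(f"total error before training: {before:.3f}, after: {after:.3f}")
```

The hidden-unit error terms are derived from the output error and the hidden-to-output weights, which is the backward propagation the text describes.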







Figure 9.47 

Local minimum. 


We can obtain the total error on a set of k output units by summing the squares of the individual
errors. Squaring the error eliminates the problem of negative errors canceling positive ones. For
some input pattern p, the mean squared error is

$$e_p = \sum_k (y_k - \tau_k)^2,$$

where $y_k$ is the actual output of unit k and $\tau_k$ is its target.

Every time we apply pattern p to the network, we compute $e_p$, then make small incremental
changes to the weights to reduce $e_p$. We continue the process as long as each step continues to
reduce $e_p$. Eventually, for a well-designed network, $e_p$ will become quite small. When it has
reached a predetermined threshold, we stop the training.

There are things that can prevent $e_p$ from becoming as small as desired. The network may not
be able to find a solution if it has fewer degrees of freedom than the problem space. There may not
be enough hidden units, or the problem may not be suitable for the type of network chosen.

A subtler difficulty can arise where the back propagation technique finds an answer but fails to
find the optimal answer. To visualize this, imagine letting a marble roll down the side of a basin
with a shallow region and a deeper region beyond, separated by a ridge (figure 9.47). The marble
might be captured by the upper region of the basin if it does not have enough momentum to ride
up over the ridge into the deeper region.

Making small incremental adjustments to the weights of the network is akin to the marble's rolling
down the basin. Once the network learns a suboptimal solution, it is unlikely to find a more
optimal one because we'd have to allow the error to grow for the network to make it up over the
ridge, but we typically stop training if the error starts growing. The shallow basin in this example
is called a local minimum. This problem can be serious but is rarely fatal. The suboptimal solution
the network finds still may be optimal enough. Or we can simply try again with different random
weights, which would be like placing the marble in a different part of the basin.
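
The marble analogy can be tried out on a one-dimensional "basin." The sketch below is purely illustrative: the curve, the step size, and the starting points are hypothetical choices, and a single variable stands in for all the weights of a network. Descending in small steps from the right-hand side ends in the shallow dip; starting from the left-hand side ends in the deeper one.

```python
def f(x):
    """A hypothetical error curve with a shallow dip (right) and a deeper dip (left)."""
    return x**4 - 2 * x**2 + 0.5 * x

def slope(x):
    """Derivative of f."""
    return 4 * x**3 - 4 * x + 0.5

def descend(x, rate=0.01, steps=2000):
    """Take small downhill steps, as incremental weight adjustment does."""
    for _ in range(steps):
        x -= rate * slope(x)
    return x

right = descend(2.0)    # settles in the shallow dip (a local minimum)
left = descend(-2.0)    # settles in the deeper dip

print(f"start at  2.0: x = {right:.3f}, error = {f(right):.3f}")
print(f"start at -2.0: x = {left:.3f}, error = {f(left):.3f}")
```

Restarting from a different point plays the role of retrying with different random weights.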

Training the Network to Recognize Consonance of Intervals. Because there are 13 inputs to
this network, there are a total of $2^{13} = 8192$ possible interval patterns we could present to the
network. While we could train the network on all possible patterns, we should not need to do so.
Neural networks not only learn by example but also generalize from a limited set of examples, so the
solution they find to a subset of the total pattern space can remain valid for patterns the network
was not trained on. The network accomplishes this by automatically discovering statistical
regularities in the patterns on which it is trained.





Table 9.15

Weights for Interval Consonance Learning Test, Hidden Units

w          C     C♯      D     E♭      E      F     G♭      G     G♯      A     B♭      B     C'
h_0    -0.69   2.12   2.22  -0.55  -0.84  -0.55   2.22  -0.86  -0.87  -1.03   2.23   2.05  -0.40
h_1     1.14  -2.98  -2.90   1.21   1.07   1.14  -2.90   1.09   1.01   0.86  -2.90  -3.07   1.38


In this example, we train the network on the 13 chromatic diads (intervals of two tones) in
table 3.5. We assume that the perfect and imperfect intervals are consonant, corresponding to an
output of 1.0, and the rest are dissonant, corresponding to an output of 0.0. We use two hidden units.
When the network has learned our consonance judgments for these intervals, we'll "surprise" it with
more complex chords to see how well it can generalize. If the network has done its job, its judgments
about these complex chords should be reasonable, even though it has never "heard" them.

The network was trained to recognize the 13 interval training patterns until the normalized mean
squared error of all patterns was below 0.001. This required about 14,300 adjustment cycles, using
only a few seconds of real time on my notebook computer. At that point, training was stopped. The
weights between the inputs and hidden units were as shown in table 9.15.

The rows show the weights connecting the input units to the first hidden unit, $h_0$, and the second
hidden unit, $h_1$. The weights between the hidden units and the output unit were as follows:

w        h_0    h_1
y_0     -6.5    7.5

So, the connection strengths from the first and second hidden units to the output unit were -6.5
and 7.5, respectively.

How the Network Learned the Training Patterns. Let's apply a couple of training patterns to
get an idea of how the network solved the problem.

Units C and C', the octave above, are activated. For the octave interval, the network output was
$y_k = 0.99$ for a target of $\tau_k = 1.0$, representing consonance (table 9.17). Referring back to
equation (9.27), table 9.16 shows how the network computed the output from the training pattern. The outputs
of the two hidden units are 0.25 and 0.93 for this example. The calculation for the output unit is shown
in table 9.17. The octave produces a strong reading of consonance ($y_0 = 0.99 \approx 1.0$) on the output.

The results of applying the tritone interval to the input units (units C and G♭ are activated) are
shown in table 9.18. The outputs of the two hidden units are 0.82 and 0.15 for this example. The
calculation for the output unit is shown in table 9.19. The tritone produces a strong reading of
dissonance ($y_0 = 0.01 \approx 0.0$) on the output.
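
These two calculations can be reproduced with a short script. The sketch below uses the rounded weights printed in tables 9.15, 9.17, and 9.19, so its results agree with the tables only to about two decimal places. The ASCII spellings of the degree names and the function names are assumptions made for illustration.

```python
import math

# Degree names in input-unit order (ASCII spellings are an assumption).
DEGREES = ["C", "C#", "D", "Eb", "E", "F", "Gb", "G", "G#", "A", "Bb", "B", "C'"]

# Input-to-hidden weights from table 9.15, one row per hidden unit.
W_HIDDEN = [
    [-0.69, 2.12, 2.22, -0.55, -0.84, -0.55, 2.22, -0.86, -0.87, -1.03, 2.23, 2.05, -0.40],
    [1.14, -2.98, -2.90, 1.21, 1.07, 1.14, -2.90, 1.09, 1.01, 0.86, -2.90, -3.07, 1.38],
]
# Hidden-to-output weights from tables 9.17 and 9.19.
W_OUTPUT = [-6.46, 7.54]

def logistic(s):
    return 1.0 / (1.0 + math.exp(-s))

def consonance(notes):
    """Return the network output for the set of sounding degrees."""
    x = [1.0 if degree in notes else 0.0 for degree in DEGREES]
    hidden = [logistic(sum(w * xi for w, xi in zip(row, x))) for row in W_HIDDEN]
    return logistic(sum(w * h for w, h in zip(W_OUTPUT, hidden)))

octave = consonance({"C", "C'"})
tritone = consonance({"C", "Gb"})
print(f"octave:  {octave:.3f}")   # close to 1.0 (consonant)
print(f"tritone: {tritone:.3f}")  # close to 0.0 (dissonant)
```

The same function can be applied to any combination of the 13 degrees, including chords the network was never trained on.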

Analysis of the Network's Solution. Recall that the weights between consonant intervals and
the hidden unit $h_0$ are negative, and those between dissonant intervals and $h_0$ are positive (see
table 9.15). By contrast, the weights between consonant intervals and the hidden unit $h_1$ are
positive, and those between dissonant intervals and $h_1$ are negative. This means that consonant





Table 9.16

Training the Network on the Octave

Degree        C     C♯      D     E♭      E      F     G♭      G     G♯      A     B♭      B     C'
Input      1.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   1.00

1st Hidden Unit
Weight    -0.69   2.12   2.22  -0.55  -0.84  -0.55   2.22  -0.86  -0.87  -1.03   2.23   2.05  -0.40
Product   -0.69      0      0      0      0      0      0      0      0      0      0      0  -0.40
Sum       -1.08
h_0        0.25

2d Hidden Unit
Weight     1.14  -2.98  -2.90   1.21   1.07   1.14  -2.90   1.09   1.01   0.86  -2.90  -3.07   1.38
Product    1.14      0      0      0      0      0      0      0      0      0      0      0   1.38
Sum        2.52
h_1        0.93

Note: Using the sum as the index into the logistic function produces the result.



Table 9.17

Hidden Units for the Octave

Unit        h_0     h_1
Input      0.25    0.93
Weight    -6.46    7.54
Product   -1.63    6.98
Sum        5.35
y_0        0.99






Table 9.18

Training the Network on the Tritone

Degree        C     C♯      D     E♭      E      F     G♭      G     G♯      A     B♭      B     C'
Input      1.00   0.00   0.00   0.00   0.00   0.00   1.00   0.00   0.00   0.00   0.00   0.00   0.00

1st Hidden Unit
Weight    -0.69   2.12   2.22  -0.55  -0.84  -0.55   2.22  -0.86  -0.87  -1.03   2.23   2.05  -0.40
Product   -0.69      0      0      0      0      0   2.22      0      0      0      0      0      0
Sum        1.54
h_0        0.82

2d Hidden Unit
Weight     1.14  -2.98  -2.90   1.21   1.07   1.14  -2.90   1.09   1.01   0.86  -2.90  -3.07   1.38
Product    1.14      0      0      0      0      0  -2.90      0      0      0      0      0      0
Sum       -1.75
h_1        0.15

Note: Using the sum as the index into the logistic function produces the result.





Table 9.19

Hidden Units for the Tritone

Unit        h_0     h_1
Input      0.82    0.15
Weight    -6.46    7.54
Product   -5.31    1.11
Sum       -4.20
y_0        0.01



intervals make the sum of $h_0$ more negative and the sum of $h_1$ more positive. Also, the weights
feeding the output unit negate activation from $h_0$ but not from $h_1$. From the shape of the logistic
function, if the sum of the activation from the hidden units is greater than 0.0, the output will
be turned on, and if it is less than 0.0, the output will be turned off. So the weights on the hidden
units have been trained to make the hidden units sum to a positive value for consonance and a
negative value for dissonance. Unit $h_0$ is turned on strongly for dissonance, and $h_1$ is turned on
strongly for consonance. This is just what we wanted, and we didn't have to program the network
to find the solution; it figured it out by itself.

Testing the Network: Can It Generalize? Let's see how well the network generalizes to
other intervals and chords. Although we only trained it on diads, the network provides encouragingly
good-quality guesses about the consonance of some more complex chords (see table 9.20).
The diminished seventh chord is arguably the only bad guess, but maybe it's not really so bad after
all. That chord is considered dissonant because of its tritone, but it can also be viewed as three
minor thirds stacked up, and the interval of a minor third is considered consonant.

So this worked pretty well. But remember that the network looks for statistical regularity, and
all our training examples and test examples have the pitch C in them as the lower tone of the
interval. How does the network handle intervals and chords starting on another degree of the scale?
Let's test the fifths between F and C and E and B (table 9.21). Both F-C and E-B should be
consonant. The network judges F-C consonant but rates E-B as dissonant, which suggests that the
network has relied on the scale degree rather than the interval to determine consonance.

Like all learners, networks tend to search for regularity. But the most regular solutions are not
necessarily the best. For example, a child might incorrectly rely on the regularity of English verbs
and say "I swimmed today" instead of "I swam today." The network appears to have stumbled for
the same reason. It appears to have associated consonance and scale position instead of consonance
and interval size because our limited training set failed to provide examples that would have
violated this assumption. This network has not discovered all the underlying relations that account for
our consonance judgments, and so it can't generalize correctly in all cases.

We can improve the ability of a network to generalize by increasing the ratio of training
examples to hidden units. The greater this ratio, the more the network is forced to generalize. For the
preceding examples, the ratio is 13/2 = 6.5. Reducing the number of hidden units to one is not





Table 9.20

Network Consonance Guesses for Complex Chords

Chord         C    C♯   D    E♭   E    F    G♭   G    G♯   A    B♭   B    C'   Output  Analysis                                       Quality
Major triad   1.0  0    0    0    1.0  0    0    1.0  0    0    0    0    0    0.99    Strongly consonant                             Good
Minor triad   1.0  0    0    1.0  0    0    0    1.0  0    0    0    0    0    0.99    Strongly consonant                             Good
7th           1.0  0    0    0    1.0  0    0    1.0  0    0    1.0  0    0    0.83    Fairly consonant despite dissonant major 7th   Good
Dim. triad    1.0  0    0    1.0  0    0    1.0  0    0    0    0    0    0    0.13    Fairly dissonant despite consonant minor 3d    Good
Dim. 7th      1.0  0    0    1.0  0    0    1.0  0    0    1.0  0    0    0    0.77    Should not be consonant                        Poor
Cluster       1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  0.01    Highly dissonant                               Good

Table 9.21

Network Performance Starting on Other Degrees

Chord   C    C♯   D    E♭   E    F    G♭   G    G♯   A    B♭   B    C'   Output  Analysis             Quality
F-C     0    0    0    0    0    1.0  0    0    0    0    0    0    1.0  0.99    Strongly consonant   Good
E-B     0    0    0    0    1.0  0    0    0    0    0    0    1.0  0    0.17    Should be consonant  Poor


an option because the network would no longer be able to learn. But it would be appropriate to
expand the training set to include all the rest of the diad intervals. If we expand the training set
to include every diad on every possible scale degree, we have 80 training patterns, 44 consonant
and 36 dissonant. This is still a small fraction of the 8192 total intervals. In practice, the minimum
number of hidden units that can solve this set of training patterns appears to be four, for a
training ratio of 80/4 = 20. With these adjustments, the network correctly handles all the
judgments.





9.22.4 Generalization as Creativity: Composing with Networks

To use a network to understand musical structure in time, we must have a neural representation
of time. Let's say we wanted the network to learn melodies. One approach would be to have as
many network inputs as there are notes in the longest melody. Or the network input could be a
fixed-size time window that slides over a region of the melody. In either case, this kind of
windowing approach represents time as position and converts the problem of learning music into
learning spatial patterns.

For example, we could train a network such that when one measure is played, the network
produces the next measure in sequence. Or we could train a network to generate the next note in
sequence by supplying it with some number of previous notes for context. This would require a
feedback arrangement in the network design so that previous outputs could influence subsequent
choices. The windowing and context methods could be combined so that the feedback units
provide context for whole musical phrases. This could be used to study the motivic structure of
melodies, for example.
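
The fixed-size window idea can be sketched as a way of cutting a melody into training examples, each pairing a few preceding notes with the note that follows. The melody (given as MIDI note numbers), the window length, and the function name below are hypothetical choices for illustration; how the resulting pairs are encoded as network inputs is left open.

```python
def windowed_examples(melody, window=4):
    """Pair each run of `window` consecutive notes with the note that follows it."""
    examples = []
    for start in range(len(melody) - window):
        context = melody[start:start + window]
        next_note = melody[start + window]
        examples.append((context, next_note))
    return examples

# A short hypothetical phrase, as MIDI note numbers (60 = middle C).
melody = [60, 62, 64, 65, 67, 65, 64, 62, 60]
examples = windowed_examples(melody, window=4)

for context, next_note in examples:
    print(context, "->", next_note)
```

Each window turns a stretch of time into a fixed set of positions, which is the conversion from temporal to spatial pattern that the windowing approach relies on.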

Peter Todd (1989) describes a process whereby a network was trained using the feedback
context method to learn a set of melodies. His approach used the back propagation method but also
included a set of feedback units that stored context information about the notes played most
recently (figure 9.48). Once trained, it could play back the melodies when keyed to do so by a set
of plan network inputs that acted like the buttons on the front of a jukebox to select the desired
melody.

First, he trained the network to play several melodies correctly. He then experimented with
setting the plan inputs to untrained values so as to force the network to generalize from the melodies
it was trained to reproduce and thereby to compose new melodies. In this way, Todd used
generalization as a model of creativity.






Todd used very simple folk melodies as training examples, but he could easily have used anything
else, including examples composed by a stochastic process or some rule-based approach. Although
Todd's network learned only the surface of the melodies, it would be straightforward to extend it
to a hierarchical set of networks such that a low-level network responsible for the note-by-note
process interacts with higher-level networks responsible for an overall compositional plan.

9.22.5 Bach Chorale Harmonization with Connectionism

A common criticism of connectionist research is that neural network techniques seem to work well
on relatively simple proof-of-concept problems but do not scale well to realistic-sized problems
traditionally studied in Artificial Intelligence such as playing chess and composing music. The
challenge of composing realistic music with neural nets was taken up by Hild, Feulner, and Menzel
(1991), who developed HARMONET, a program to harmonize chorale melodies in the style of
J. S. Bach.18 Their aim was not only to demonstrate parity with more traditional AI techniques but
to exploit the potential for net-based solutions to go, as it were, beyond the rules and penetrate more
deeply into the core of a composer's style.

In fact, their approach turned out to be a hybrid of a symbolic expert system for some parts of the
problem and neural networks for other parts. In particular, they did a fair amount of manual parsing
of the chorales to structure the data to create their training set. Then they trained a network with
this set to create a "harmonic skeleton" of several chorales. The chorale melody then provided the
soprano line, and the harmonic skeleton provided a bass line. Then they had to synthesize the alto
and tenor lines, which they did using a standard AI "generate and test" approach. Last, they added
passing eighth-note figures characteristic of Bach's style using another network. All networks
used a standard back propagation architecture with context units to remember recent events,
similar to Peter Todd's approach.

Because Hild and his colleagues don't just use networks throughout, it's not clear that this is the
breakthrough realistic-sized problem for connectionist research in music. Nonetheless, they stated
that an audience of music professionals had determined HARMONET's output to be "on the level
of an improvising organist," and indeed printed scores of their harmonizations seem quite good.

9.22.6 Genetic Programming 

We have considered compositional processes over the last thousand years of human history. Time
and again, we see a trade-off between generating music and critiquing music. We see it at all levels
of the process, from the smallest local detail of a private act of composition to the most public
pronouncements of music critics.

In every age, composers put forward their ideas in the context of culture, and critics evaluate
them in the same context. Successful works, ideas, and methods survive; unsuccessful ones are
scrapped and forgotten. Both composition and criticism adapt to cultural changes. Successful
adaptation may mean reproduction (in the sense that children are reproduced from their parents),
crossover (swapping elements between successful adaptations the way parents pass their
characteristics along to their children), mutation (where novel elements are introduced), permutation, and
other reordering processes. What composers do in subsequent days and in subsequent epochs can
generally be seen as an evolution from antecedents. Thus, composition can be likened to a natural
selection process.

Any process that we can identify we can also model, and a useful computational model for
this view of composition is provided by genetic programming (Koza 1992). This technique
adapts some of the principles of biological natural selection to allow programs to evolve
spontaneously.

Suppose we start with a collection of primitive functions to generate and modify basic musical
data (such as algorithms to generate and transform a tone row). These are supplied to a genetic
programming system, which creates a population of programs that invoke these primitive
functions in various random ways. The genetic programming system then executes the population of
programs, and their results are evaluated for how well they succeed. This critique is provided by
yet another function we must supply that determines how well the programs perform their task,
that is, their fitness. Because they were generated randomly, most of the programs probably won't
perform very well, but we take those that perform best for subsequent development and discard
the rest.

A new set of programs is created from those that survived the previous round by reproduction,
crossover, mutation, permutation, and so on. These are tested as before, and the process repeats
until some criterion of fitness is achieved.
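
The generate-and-critique cycle can be sketched as follows. This is a deliberately simplified stand-in, not Koza's tree-based genetic programming: each "program" is a fixed-length list of tone-row primitives, the fitness function (counting adjacent consonant intervals) is a hypothetical critic, and the population size, survivor count, mutation rate, and fixed number of generations are arbitrary assumptions.

```python
import random

# Primitive functions that transform a 12-tone row (a list of pitch classes 0-11).
def retrograde(row): return row[::-1]
def invert(row): return [(-p) % 12 for p in row]
def rotate(row): return row[1:] + row[:1]
def times_five(row): return [(5 * p) % 12 for p in row]
def interleave(row): return row[0::2] + row[1::2]
def swap_pairs(row):
    out = list(row)
    for i in range(0, len(out) - 1, 2):
        out[i], out[i + 1] = out[i + 1], out[i]
    return out

PRIMITIVES = [retrograde, invert, rotate, times_five, interleave, swap_pairs]
CONSONANT = {3, 4, 5, 7, 8, 9}  # interval sizes in semitones, modulo the octave

def run(program, row):
    """Execute a program: apply its primitives to the row in order."""
    for primitive in program:
        row = primitive(row)
    return row

def fitness(program, seed_row):
    """Hypothetical critic: count adjacent pairs that form a consonant interval."""
    row = run(program, seed_row)
    return sum(1 for a, b in zip(row, row[1:]) if (b - a) % 12 in CONSONANT)

def crossover(rng, mom, dad):
    cut = rng.randrange(1, len(mom))
    return mom[:cut] + dad[cut:]

def mutate(rng, program):
    child = list(program)
    child[rng.randrange(len(child))] = rng.choice(PRIMITIVES)
    return child

def evolve(seed_row, generations=30, size=40, keep=10, length=5, seed=0):
    """Return the best fitness found in each generation."""
    rng = random.Random(seed)
    population = [[rng.choice(PRIMITIVES) for _ in range(length)] for _ in range(size)]
    history = []
    for _ in range(generations):
        population.sort(key=lambda p: fitness(p, seed_row), reverse=True)
        history.append(fitness(population[0], seed_row))
        survivors = population[:keep]              # reproduction: the best carry over
        children = []
        while len(survivors) + len(children) < size:
            mom, dad = rng.sample(survivors, 2)
            child = crossover(rng, mom, dad)       # crossover
            if rng.random() < 0.3:
                child = mutate(rng, child)         # mutation
            children.append(child)
        population = survivors + children
    return history

history = evolve(list(range(12)))
print("best fitness per generation:", history)
```

Because the best programs are carried over unchanged, the best fitness never decreases from one generation to the next, although nothing guarantees that the final result is optimal.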

The good news is that this approach, like artificial neural networks, avoids the requirement of
knowing what the solution should be in advance. The bad news is that the solutions may not be optimal;
and for realistic-sized problems, solutions may not be scrutable (see especially Todd and
Werner 1998).

9.22.7 Summary of Connectionism

A promised advantage of artificial neural networks is that the composer need not invent rules to
express preferences. Such preferences are an emergent property of the network. The fact that no
music theory is implied in the structure of a network is a benefit because it allows any theory
embodied in a model to arise. The ability of a network to generalize from examples provides the
composer with ways to go beyond the model in a musically reasonable way.

Such networks can be used to study the psychophysics of sound, the perception of timbre,
pitch, and rhythm, tonal analysis, musical instrument fingering, sound synthesis, automatic
music classification, recognition directly from the waveform, emotion in music, musical
phrasing and interpretation, automatic music manuscript transcription, and many other areas (Todd
and Loy 1989).

But both conventional AI and connectionist approaches seem to run out of power when scaled
up to the size of problems we'd like them to be able to solve. Perhaps hybrid systems, such as
HARMONET, that combine conventional AI techniques with artificial neural networks will
eventually succeed where the two approaches separately have faltered. Or perhaps we've simply not
found the right model for intelligence yet.





9.23 Representing Musical Knowledge

The terms arrival and departure are often used in musical analysis because they capture something
true about our experience of music. These terms suggest a sense of time and place, and that the
music conducts us along a pathway structured by the composition.

Directed graphs embody this sense of place and transition, and we observed the usefulness of
directed graphs to characterize the unfolding of a musical theme (see section 9.18.3). Petri's (1979)
general net theory extends the directed graph to characterize causal systems of arbitrary
morphology and abstraction. Antoni and Haus (1982) adapted them to represent musical structure and
knowledge. Haus and Sametti (1991) describe a software tool, ScoreSynth, for analyzing and
synthesizing musical scores using Petri nets.

9.23.1 Petri Nets 

Petri nets look like directed graphs but with additional elements. As with directed graphs, states
are represented as circles, and transitions between states are represented by the movement of
tokens along arcs connecting states (see figure 9.37 for an example of a directed graph). But with
Petri nets, multiple tokens flow through the net simultaneously. Transitions in the network state
can trigger other actions, such as causing transitions to occur in subnets, nested hierarchically. The
flow of time can be made explicit in Petri nets. They can handle deterministic and nondeterministic
(stochastic) operations. Representation of music structure with Petri nets is compact and
expressive. The elements of a Petri net can refer to musical objects such as notes, phrases, motives,
sections, and the like, or they can refer to nonmusical objects that manage and control the
compositional process.

The basic Petri net elements are places, transitions, and arcs (figure 9.49). Places and transitions
are connected by arcs. Two numbers may appear inside a place. The upper number (n) indicates
how many tokens it currently contains; the lower (N) indicates the maximum number of tokens
it may contain. Transitions control the movement of tokens in the network. Places and transitions
can also contain subnets.


Figure 9.49

Petri net icons. Places: a named place (p1) containing n tokens out of a maximum of N; a named place representing a subnet. Transitions: a named transition (t1); a named transition (t2) representing a subnet. Arc: an arc with multiplicity.





The Firing Rule. The execution of a net is determined by the firing of its transitions. The basic
rule for firing is as follows (Haus and Sametti 1991):

A transition may fire if each one of the input places, i.e., places which are connected with oriented arcs to the
transition, has at least one token. The transition firing has two effects: to decrement the marking of each input
place by one token and to increment the marking of each output place, i.e., a place which is connected with
an oriented arc from the transition, by one token.

After the starting of a net, firings follow one another until there are no more transitions which may fire. At
the end of transition firings, the execution of the net stops (6).

For example, figure 9.50 shows an elementary sequence. Initially, the input place p1 contains
a single token, and the output place contains none. The firing rule indicates that t1 can fire. It
decrements the token count in p1 by 1 and increments the output place by 1. This basic firing rule is
extended by the following additional rules.
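
The basic rule can be sketched directly, before capacity and multiplicity are added. The representation here (a dictionary from place names to token counts, and hypothetical function names) is an assumption made for illustration; it mirrors the elementary sequence of one input place, one transition, and one output place.

```python
def enabled(marking, inputs):
    """A transition may fire if every input place holds at least one token."""
    return all(marking[place] >= 1 for place in inputs)

def fire(marking, inputs, outputs):
    """Return the new marking after firing: one token leaves each input
    place and one token enters each output place."""
    if not enabled(marking, inputs):
        raise ValueError("transition is not enabled")
    after = dict(marking)
    for place in inputs:
        after[place] -= 1
    for place in outputs:
        after[place] += 1
    return after

before = {"p1": 1, "p2": 0}
after = fire(before, inputs=["p1"], outputs=["p2"])
print(before, "->", after)
```

Once the token has moved to p2, no transition is enabled, so execution of this small net stops.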

Capacity. Each place in a network can be assigned a maximum capacity of tokens, represented
as the lower of the two numbers indicated inside places. (If no number is indicated, 1 is assumed.)
"Transitions cannot fire if the marking of one output place, at least, will exceed its capacity after
transition firing" (Haus and Sametti 1991, 6).

Figure 9.51 shows two examples. In the first case, p2 is full, so t1 cannot fire. The second case
represents a conflict, because only one transition can fire. The network determines whether t1 or
t2 will fire nondeterministically (stochastically).

Multiplicity. If a numerical label is affixed to an arc, called the multiplicity value, then the
firing rule must be modified: "A transition may fire if each one of the input places has at least as
many tokens as the numerical label on the arc [multiplicity value] connecting the place to the
transition."19


Figure 9.50

Petri net sequence. (Before and after firing: p1, t1, p2.)

Figure 9.51

Effect of capacity on firing.






Figure 9.52

Example of firing with multiplicity.

Figure 9.53

Musical objects. (The net p1, t1, M1; the musical object M1 carries the musical action "play pitch A440.")

Firing now decrements the number of tokens of each input place by the multiplicity value on the
arc connecting the place to the transition, and increments the number of tokens of each output place
by the multiplicity value on the arc connecting the transition to the place. Figure 9.52 shows a
network with multiplicity ready to fire, and the results after firing.

Firing happens in three stages:

1. The network determines that t1 can fire because each of the input places has at least as many
tokens as the corresponding multiplicity value on the arc connecting it to the transition.

2. Upon firing, the transition t1 subtracts the number of tokens from the input places specified by
the multiplicity values on the arcs connecting the places to the transition.

3. The transition then adds as many new tokens in the output places as indicated by the multiplicity
values on the arcs connecting the transition to the output places. (Multiplicity does not conserve
the total number of tokens.)
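
The three stages, together with the capacity rule, can be sketched as a small class. The class and method names are hypothetical, and the example net (token counts, capacities, and multiplicities) is invented for illustration rather than taken from figure 9.52. A capacity of 1 is assumed for any place that does not specify one, as the text states.

```python
from dataclasses import dataclass

@dataclass
class Transition:
    inputs: dict    # input place -> multiplicity of the arc into the transition
    outputs: dict   # output place -> multiplicity of the arc out of the transition

class PetriNet:
    def __init__(self, marking, capacity, transitions):
        self.marking = dict(marking)      # place -> current number of tokens
        self.capacity = dict(capacity)    # place -> maximum number of tokens
        self.transitions = transitions    # name -> Transition

    def can_fire(self, name):
        """Stage 1: enough tokens on every input arc, and room in every output place."""
        t = self.transitions[name]
        enough = all(self.marking[p] >= m for p, m in t.inputs.items())
        room = all(
            self.marking[p] - t.inputs.get(p, 0) + m <= self.capacity.get(p, 1)
            for p, m in t.outputs.items()
        )
        return enough and room

    def fire(self, name):
        if not self.can_fire(name):
            raise ValueError(f"{name} cannot fire")
        t = self.transitions[name]
        for p, m in t.inputs.items():     # stage 2: remove tokens from input places
            self.marking[p] -= m
        for p, m in t.outputs.items():    # stage 3: add tokens to output places
            self.marking[p] += m

# Hypothetical net: t1 takes 2 tokens from p1 and puts 3 tokens into p2.
net = PetriNet(
    marking={"p1": 3, "p2": 0},
    capacity={"p1": 3, "p2": 4},
    transitions={"t1": Transition(inputs={"p1": 2}, outputs={"p2": 3})},
)
net.fire("t1")
print(net.marking)
```

In this example two tokens are consumed and three are produced, which shows that multiplicity does not conserve the total number of tokens.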

Musical Objects, Musical Actions. We can associate places with any musical significance we
like. Places can represent individual notes, phrases, dynamics, motives, and so on. We associate
musical meaning to places by affixing labels to them that refer to defined musical objects. In
figure 9.53 place M1 has been designated to be a musical object because its name starts with M.

When tokens flow into or out of a musical object, the associated musical action is triggered. If
the musical action is a note, it is played; if it is a phrase, the phrase is played; if it is a subnet, the
subnet is entered. In figure 9.53 musical object M1 is about to trigger its associated musical action.
When t1 fires, M1 will play pitch A440.

Figure 9.54

Timed net example.

Timed Firing. But how long will the note in figure 9.53 last? Petri nets have no inherent notion
of time or sequence. Many transitions can be qualified to fire at the same time, but the
implementation of Petri nets gives no direct control over their order of firing. Also, the duration of firing is
assumed to be instantaneous. We must add time structure to the network. Haus and Sametti (1991)
took the approach of associating time with musical objects. "When a token is put into a place with
an associated MO [musical object] the token cannot be considered for the firing of transitions
connected to the place until the associated MO has ended" (8).

To illustrate, suppose that musical object MA is defined to last 1 second (figure 9.54). Every token
put into MA is not disposable for 1 second. We also define MB to last 6 seconds per token. Tokens
put into places that are not defined as musical objects are immediately disposable. For example, if
MA has one token, then t2 is prevented from firing because the capacity of MA is 1 token. Similarly,
t3 can't fire because MA does not present its token to t3 until its duration of 1 second has elapsed.
The sequence of firings for figure 9.54 is as follows.

1. t1 fires, subtracting one token from p1 and adding one token to MB and five tokens to p2.

2. MB triggers an instance of its associated musical action, which will last 6 seconds.

3. t2 fires, subtracting one token from p2 and adding one token to MA, which is now at capacity.

4. MA triggers an instance of its associated musical action, which will last 1 second.

These steps transpire instantaneously because they are triggered by places p1 and p2, not musical
objects. MB has not reached its capacity, and up to five more instances of MB can be triggered.
However, no tokens are available from t1 because p1 is exhausted. MA has reached its capacity, so t2 and
t3 are prevented from firing. MA's token will fire t3 in 1 second, so the network must wait until MA's
token is available. At this point, we can represent musical actions in progress at time 0 (figure 9.55).

5. When 1 second has elapsed, t3 fires, passing the token from MA to MB. Another instance of
MB is triggered.

6. MA is now empty, so t2 fires, passing a token from p2 to MA. Another instance of MA is triggered.



Figure 9.55

Musical actions in progress at time 0. (Timeline of MA and MB; time axis marked at 1 and 6.)

Figure 9.56

Musical actions in progress at time 1. (Timeline of MA and MB; time axis marked at 1, 6, and 12.)

Figure 9.57

Complete result of network execution. (Time axis marked at 1, 6, and 12.)

Musical actions in progress at time 1 are shown in figure 9.56.

From this point on, MA and MB will be triggered only by the remaining tokens in p2, consumed
in 1 second intervals by MA, then passed to MB. The complete result of network execution is
shown in figure 9.57.
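The firing sequence for the timed net can be traced in a few lines of Python. What follows is only a sketch: it hard-codes the topology described for figure 9.54 rather than implementing a general Petri net engine, it assumes MB has enough capacity for every token it receives, and the function name simulate_timed_net and its parameters are my own labels, not part of Haus and Sametti's formalism.

```python
def simulate_timed_net(p2_tokens=5, ma_duration=1, mb_duration=6):
    """Return (place, start, end) tuples for each musical action triggered.

    Assumption: the topology is the one described for figure 9.54
    (p1 -> t1 -> MB and p2; p2 -> t2 -> MA; MA -> t3 -> MB).
    """
    actions = []
    now = 0
    # t1 fires: p1 loses its token, MB gains one, p2 gains p2_tokens.
    actions.append(("MB", now, now + mb_duration))
    p2 = p2_tokens
    while p2 > 0:
        # t2 fires as soon as MA is empty (capacity 1): p2 -> MA.
        p2 -= 1
        actions.append(("MA", now, now + ma_duration))
        # MA withholds its token until its action ends; then t3 fires: MA -> MB.
        now += ma_duration
        actions.append(("MB", now, now + mb_duration))
    return actions


if __name__ == "__main__":
    for place, start, end in simulate_timed_net():
        print(f"{place}: {start}s to {end}s")
```

With the default values, MA is triggered once per second while p2 still holds tokens, and each token that leaves MA starts a new, overlapping instance of MB.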

Refinement Morphisms  Petri nets can be developed through a process of refinement, where a
place or transition can act as a placeholder to be given a more detailed description later.
"Refinements can define very complex PN [Petri net] models by means of simple PNs and
hierarchical structures, i.e., allowing models to be designed by either a top-down or a bottom-up
approach" (Haus and Sametti 1991, 10).



Figure 9.58

Petri net with subnet. (Panels: father net, daughter subnet.)


The placeholder node is called the father and the subnet it refers to is the daughter. To
associate a daughter subnet with a father place, the subnet must have an input place and an output
place. Input arcs to the father place are input arcs to the input place of the daughter net; output
arcs from the father place are output arcs from the output place of the daughter net. Transitions
can be refined in the same way. An example is shown in figure 9.58.

Building Blocks  Some of the basic Petri net building blocks for music are as follows:

■ Sequence (monophony). The musical object associated 
with each place is triggered as the token flows along. This 
can be used to form a melody out of its constituent phrases 
or a movement out of its sections. 

■ Parallel (polyphony). A single place triggers musical
objects M1 and M2 to execute concurrently. This can
be used to form a chord or to synchronize polyphonic
counterpoint.

■ Choice. From a single place, one of two paths can be taken.
Since there are two paths but only one token, the net must
(stochastically) choose one path.



■ Join. Two input places trigger M1, so two instances of M1
are created and run concurrently.

■ Fusion. Only one of the two input places can trigger M1.
One instance of M1 is triggered.


■ Iteration. The Start place is provided with as many
tokens as the required iterations. There must be room for
them at the End. In this example, two instances of M1
execute sequentially.


Canon Perpetuus  All of these structures can be joined to create more complicated networks.
For example, Antoni and Haus (1982) provide a sample analysis of J. S. Bach's "Canon Perpetuus"
from his Musical Offering for flute, violin, and continuo bass. 20 A brief overview of the flute part
will be illustrative. The pitches of the flute part in the score can be grouped into musical motives
as follows:


Name    Section

F1      bars 1-2
F2      bars 3-10 and the following three notes
F3      last note of bar 11 to first note of bar 13
F4      the rest of bar 13 to bar 14
F5      bars 15-17 and the following note
Fend    last note of the flute part


Using these named sections, the flute part can be analyzed as follows: 

$$\{F_1, F_2, F_3, F_4, F_5, i(F_2), rt(F_3), F_1, F_2, F_3, F_4, F_{\mathrm{end}}\}, \tag{9.30}$$

where i() indicates inversion and rt() indicates transposed retrograde. This sequence structure
is represented by the Petri net shown in figure 9.59. Place Start contains a token, as does place
Ping. In the beginning only the transition leading from Start can fire. When the token reaches
F4, T1 can fire but T3 cannot. So the token visits F5 and goes down the right-hand arm in the
figure, eventually reaching T2. Both F1 and Pong receive a token, but only the transition from F1
can fire. So the token visits F1 through F4 again. Now T1 cannot fire but T3 can. So the token visits
Fend, then Stop.
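The motive ordering in (9.30) can also be written out as a short program. The sketch below makes several assumptions that are mine, not Antoni and Haus's: motives are lists of MIDI note numbers, inversion mirrors each pitch about the first pitch of the motive (one common definition), the transposition interval of rt() is left as a parameter because it is not given here, and the pitches in the toy dictionary are placeholders rather than Bach's notes.

```python
def invert(motive):
    """Mirror each pitch about the motive's first pitch (assumed definition of i())."""
    axis = motive[0]
    return [2 * axis - pitch for pitch in motive]


def transposed_retrograde(motive, semitones=0):
    """Reverse the motive and shift it by `semitones` (assumed definition of rt())."""
    return [pitch + semitones for pitch in reversed(motive)]


def flute_part(motives, rt_shift=0):
    """Concatenate the motives in the order given by (9.30)."""
    order = [
        motives["F1"], motives["F2"], motives["F3"], motives["F4"], motives["F5"],
        invert(motives["F2"]),
        transposed_retrograde(motives["F3"], rt_shift),
        motives["F1"], motives["F2"], motives["F3"], motives["F4"],
        motives["Fend"],
    ]
    return [pitch for motive in order for pitch in motive]


# Placeholder pitches (MIDI note numbers); these are NOT the notes of the canon.
toy = {
    "F1": [60, 62], "F2": [64, 65, 67], "F3": [69, 67],
    "F4": [65], "F5": [64, 62], "Fend": [60],
}

if __name__ == "__main__":
    print(flute_part(toy, rt_shift=2))
```

The Petri net of figure 9.59 expresses the same ordering without spelling out the repeated motives; the list above is simply the unrolled result of one run of that net.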



Figure 9.59

Petri net model of flute part from "Canon Perpetuus."

Figure 9.60

Petri net model of violin part from "Canon Perpetuus." (Nodes in order: Start, R, i(F), Stop.)

We observe that the violin part of "Canon Perpetuus" is the inversion of the flute part played
against the flute with a two-bar delay. Using F to represent the sequence shown in (9.30) except
for the final sequence {F4, Fend}, we can write the violin part as

$$R,\ i(F), \tag{9.31}$$

where R is the two bars of rest. The Petri net for this sequence is shown in figure 9.60.

Petri nets model the behavior of discrete dynamical systems such as musical scores in a direct
and intuitive way. They handle many concepts that are vital to the musical process, including
sequence, concurrency, conflict, and resolution. The resulting network descriptions naturally
facilitate deeper understanding of the underlying musical system. Petri net representations of
music can remove a great deal of redundancy in a musical score, revealing the essential structure
of the work. They have been used to analyze music structures of significant size (Haus and
Rodríguez 1993). They provide a pragmatic method for hierarchical representation of musical
knowledge (Roads 1984). Petri nets can also be constructed from scratch to synthesize musical
scores, or nets that are the result of analysis can be modified for subsequent synthesis of related
musical works. However, Petri nets can become explosively large when they are used to describe
realistically complex systems.

9.23.2 Predicate/Transition Nets

High-level Petri nets, also called Predicate/Transition (PrT) nets, have been developed to
overcome the problem where Petri nets become unmanageably large when used to model realistically
complex problems (Genrich and Lautenbach 1981). The general idea is to attach additional
information to the elements of the network to increase their descriptive power.

For example, in Petri nets, tokens are simple featureless counting devices. In PrT nets, tokens
have quantity and quality. In fact, they can have multiple quantities and qualities. Algebraic and
logical expressions (predicates) can be added to places, transitions, and edges to describe network
state and firing. The expressions are evaluated based on the available types and quantities of the
tokens flowing through the system.

Consider a single place, called PianoImprov, which models the musical resources of the piano
part of a musical improvisation. PianoImprov contains a collection of tokens representing
the pitches the pianist can play. The content of a place is called its marking. Suppose that
PianoImprov is marked by two C4 pitches, three E4 pitches, and two G4 pitches. Because it holds
pitches, we say place PianoImprov is of type Pitches. This means PianoImprov can only contain
elements of type Pitches (figure 9.61).

When an instance of a place is created, it is given an initial marking. The initial marking of
PianoImprov can be expressed as

$$M_0(\mathit{PianoImprov}) = (2 \cdot C4) + (3 \cdot E4) + (2 \cdot G4).$$

The markings of a network will vary as the tokens are consumed and produced across the net during
operation.

An arc from a place to a transition carries tokens consumed by the transition, and an arc from
a transition to a place carries tokens consumed by the place. The label on an arc indicates the
number and kind of tokens that can be consumed or produced. We can specify that any quantity of any
type of token can travel an arc, or we can restrict the arc to certain quantities and types of tokens.
The arc x in figure 9.62 is defined to be of type Pitches, and indicates that any number of tokens
of type Pitches can be produced by PianoImprov and consumed by transition Play in a single
transaction. By this rule, one pitch, any combination of pitches, or all pitches can be played by the
piano at once.

Figure 9.61

PianoImprov 1. (Place PianoImprov of type Pitches, holding tokens C4, C4, E4, E4, E4, G4, G4.
Pitches: {C4, E4, G4}.)

Figure 9.62

PianoImprov 2. (Place PianoImprov joined to transition Play by arc x. Pitches: {C4, E4, G4};
x: Pitches; M0(PianoImprov) = 2 · C4 + 3 · E4 + 2 · G4.)

Figure 9.63

PianoImprov 3. (Place PianoImprov, transition Play, and place Listen. Pitches: {C4, E4, G4};
x: Pitches; M0(PianoImprov) = 2 · C4 + 3 · E4 + 2 · G4; M0(Listen) = 0.)

In figure 9.63 PianoImprov is ready to play any of its pitches for Listen. Listen begins in
an empty state. However, by the rule stated beneath the Play transition, Play can only fire if the
received token is the pitch C4, so Listen can only hear that pitch.

The network shown in figure 9.63 can be defined as follows: 

P = {PianoImprov, Listen}

T = {Play}

F = {(PianoImprov, Play), (Play, Listen)},

where P is the set of places, T is the set of transitions, and F defines the arcs that connect places
and transitions. Because arcs link places and transitions, F is a list of place/transition pairs.
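The definition of the network in figure 9.63 translates almost directly into data structures. The following is a minimal sketch under stated assumptions: markings are stored as multisets using Python's collections.Counter, the predicate on Play is the C4-only rule described above, and the sketch moves one token per firing even though arc x permits any number of tokens in a single transaction. The function names are illustrative and do not come from Genrich and Lautenbach or from Pope.

```python
from collections import Counter

# P, T, and F as defined for figure 9.63.
places = {"PianoImprov", "Listen"}
transitions = {"Play"}
arcs = [("PianoImprov", "Play"), ("Play", "Listen")]


def initial_marking():
    """M0: PianoImprov holds 2 C4, 3 E4, 2 G4; Listen is empty."""
    return {
        "PianoImprov": Counter({"C4": 2, "E4": 3, "G4": 2}),
        "Listen": Counter(),
    }


def play_predicate(token):
    """Rule attached to Play: only the pitch C4 may pass."""
    return token == "C4"


def fire_play(marking, token):
    """Move one token from PianoImprov through Play to Listen.

    Returns True if the transition fired, False if it was not enabled.
    """
    if marking["PianoImprov"][token] == 0 or not play_predicate(token):
        return False
    marking["PianoImprov"][token] -= 1
    marking["Listen"][token] += 1
    return True


if __name__ == "__main__":
    m = initial_marking()
    for pitch in ["E4", "C4", "C4", "C4"]:
        print(pitch, fire_play(m, pitch))
    print(dict(m["Listen"]))
```

Only the two C4 tokens ever reach Listen; the E4 and G4 tokens remain in PianoImprov because the predicate never enables Play for them.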

The combination of typed elements with capacities and predicates makes PrT nets more
expressive and representationally compact than Petri nets. Pope (1986) gives an example of the
use of PrT nets in a musical context and shows how PrT nets can be abstracted to become the
template for deriving other related networks. In this way, PrT networks begin to take on some
of the characteristics of object-oriented computer programming languages, such as type
inheritance and abstraction, but with the advantage of built-in facilities for modeling the behavior of
discrete dynamical systems such as musical scores (Pope 1991). In fact, graphical simulation
techniques and high-level computer language design are beginning to converge to the point that
practical tools for modeling and emulation of discrete dynamical systems are now commonly
available. 21

What this approach lacks is a built-in mechanism for learning, abstraction, pattern completion,
and spontaneous generalization provided by the connectionist approach. Since both PrT nets and
connectionist frameworks model dynamical systems, perhaps a hybrid approach combining
elements of both would prove sufficiently expressive for problems of realistic scale.

9.24 Next-Generation Musikalische Würfelspiel

The tables used to generate Musikalische Würfelspiel compositions were each predetermined by
a master composer, so a composition in the style of that composer is guaranteed if the method is
followed. The composer David Cope (2001) has developed a set of programs he calls Experiments
in Musical Intelligence (EMI) that has a similar aim: EMI produces original works in the style of
a particular composer by recombining atomized musical quotations derived from that composer's
works. But whereas the composers of Musikalische Würfelspiel had to compose their own
atomized musical tables, Cope's EMI system generates the musical tables that are the basis of the new
works to be composed by analyzing the target composer's musical corpus under the direction of
a trained operator.

EMI performs its analysis using techniques drawn from natural language processing,
augmented transition networks, and other techniques drawn from AI to synthesize new compositions.
Like Markov and connectionist approaches, EMI recomposes the music of the target composer.
The result is highly original (though not always very artful) music with the stylistic signature, in
both its surface and deep structure, of the identified composer.

Music expresses its own essential nature much the way that organisms are expressions of their
genes. If we can identify the genetic basis (so to speak) of a composer's style in a sufficiently
formal way, we should be able to use it to create original compositions in that style. This issue speaks
to the desire Schillinger first expressed that theories of art should be generative, not merely
analytical (see section 9.11.2).

Cope's aesthetic premise is that new music in the target composer's style can be created through
recombinant techniques. EMI is an analysis/synthesis system that creates a database of musical
elements by analyzing a composer's works and then interpolates among them in various ways to
realize new works. In outline, the method is as follows:

1. The user must select and encode a corpus of musical works from the target composer into a
format that EMI can digest. To facilitate pattern recognition, Cope suggests, the selected works
should be relatively homogeneous, similar in overall structure, range, and orchestration. For
example, Cope (1999) used Mozart's middle symphonies (numbers 6-31) as an analysis set for a new
symphony in Mozart's style. He used a similar approach for a new piano concerto. Both works
have been recorded and are available commercially. This is the same initial step that must be
employed by any system that learns from a corpus of examples. Clearly, the operator's selections
have dramatic impact on EMI's subsequent steps.

2. EMI performs a lexical analysis based generally on Noam Chomsky's theories of the structure
of natural languages, and a hierarchical temporal and harmonic analysis of the works based on the
ideas of Heinrich Schenker (1935).

3. EMI identifies what Cope calls signatures, which are unique characteristics of the
composer's style, using pattern recognition techniques adapted from natural language processing.
The analysis ostensibly contains no a priori notion as to the signatures to be found, so the
technique can presumably be applied to music of any style. But in fact there is great latitude in this
step for the EMI operator to refine the process of signature selection based on the operator's prior
experience with the composer's style. Cope has reported, for example, that in the case of the
Mozartian symphony and piano concerto, he took great pains to tune EMI's analysis parameters
to greatest advantage.

4. Driven by this analysis, EMI then breaks the musical corpus into its fundamental components,
which are now ready to be recombined.

5. EMI uses augmented transition networks driven by a random process, and makes refinements
by pattern matching based on extracted stylistic features, to recombine the music into an original
that preserves the composer's signature style. The creative aspect of EMI reflects many of the
techniques described in this chapter. It can provide variation by interposing similar but distinct
elements from the analysis. The recombination can take place on several levels of musical scope
because the hierarchical analysis provides compositional rules for thematic, middle-ground, and
large-scale structures.

6. The original work is then formatted for representation in common music notation to be played
by traditional instruments or converted to a format such as MIDI so it can be synthesized.

In the hands of a skilled operator, EMI can produce a believable facsimile of a composer's style
from a carefully selected corpus of the target composer's works. Cope has also published examples
in the style of J. S. Bach, Frédéric Chopin, and Scott Joplin, among others.

Cope's EMI system is perhaps the most advanced automated composition system extant today
and therefore can serve as a good target for analysis and criticism. In the following sections, I
discuss some of the important questions raised by his work.

9.24.1 Is EMI Experimental?

Insofar as Cope defines his system as "Experiments in Musical Intelligence," it is fair to ask if
Cope's system is truly an experimental method.





There is actually a seventh step in the EMI process, not an official part of Cope's system but
obviously of crucial importance: editing. The operator selects among the generated compositions
for those good enough to be played in public. Editing is a necessary step because there would be
little audience for every one of the possible compositions such a system can create, just as there
would have been little audience for what Beethoven threw in his trash basket.

Although the quality of EMI's compositions and their entertainment value is crucial, these are
not their most important credentials: the selected compositions, especially the ones created by the
author of the method, become official specimens, proof of the effectiveness of the method that
created them. But there is a contradiction between hand-selecting the most convincing examples and
the requirements of a true experimental method.

The experimental method in science is about testing suspected explanations regarding one's
observations. To get at the truth, one conducts experiments that must be carefully constructed to
avoid hidden biases and confounding factors, undetectable things that might influence the results
and lead the research astray. The most common confounding factor is pure chance, where through
luck one happens upon a population of specimens that erroneously validates or invalidates the
hypothesis.

Because of the danger of confounding factors in experimental design, the scientific method
requires that conclusions must not be based merely on anecdotal evidence such as a single instance
or a very limited group of specimens or subjects. Regardless of the hypothesis Cope is expressing,
and though the published examples of his EMI work are superlative, his specimens constitute
a very limited group and therefore must be considered only as anecdotal evidence of EMI's
effectiveness.

But EMI should not be singled out here; this criticism can also be directed at every method
discussed in this chapter. Hiller and Isaacson (1959), the other composers to use the word experiment
in the name of their composing system, were the first to face this problem. In creating their Illiac
Suite, they similarly had to generate official specimens that proved the effectiveness of their
methodology. In their book Experimental Music, they claim that they used no preferential criterion to
select example outputs from their program for inclusion in the Illiac Suite so as not to color the
results. Although their approach helps, it does not prevent their single published composition
(the Illiac Suite) from being anecdotal evidence.

Use of the terms experiment and experimental in the arts is difficult because the terms are so
freighted with scientific meaning. However, there is a constellation of related words, such as
experience and experiential, that share the same root and that capture the importance of personal
observation to both the arts and the sciences. If we think of experiment as meaning qualified
experience, I think we get closer to how artists think about being experimental in their art. Artistic
experiment is about considering novel or unusual combinations of elements for the purpose of
increasing the horizons of surprisal. I spent a number of years working at the Center for Music
Experiment at the University of California, San Diego, and this pretty well characterizes what
went on there.





9.24.2 Is EMI Intelligent?

When I heard EMI's Mozartian symphony and piano concerto for the first time, I definitely had
the experience of listening to Mozart, or perhaps a good Mozart imitator. To have captured
Mozart's style so aptly is such a stunning technical and aesthetic accomplishment that it warrants
asking, Is EMI intelligent?

Turing (1950) suggested that if we can't distinguish between an intelligent person's choices and
a computer's choices, then it is reasonable to say that the machine is behaving intelligently. Insofar
as Cope defines his system as "Experiments in Musical Intelligence," it is fair to ask if Cope's
system passes a musical equivalent of Turing's test.

9.24.3 Aural Sensibility 

Hiller and Isaacson (1959) attempted to directly encode rules of composition into their system, but
they realized that there were limits to what they could accomplish just with the use of rules.
Composing is about more than following rules. They wrote, "The composer is traditionally thought of
as guided in his choices not only by certain technical rules but also by his 'aural sensibility,' while
the computer would be dependent entirely upon a rationalization and codification of this 'aural
sensibility.'"

Other systems, such as DEC and neural networks, allow a composer's "aural sensibility" to
emerge from experience by inference and generalization, emulating human learning. However,
EMI, arguably more successful to date than these other approaches, actually relies much less than
they do on reasoning and inference. Instead, it takes a brute force approach. By analyzing a large
corpus of works, EMI's analysis phase attempts to provide its synthesis phase with a rich set of
options for every choice faced by the target composer. EMI's analysis database is essentially a
compendium of answers to the question, What would Mozart have done in this situation?

This is similar to the approach presumably taken by IBM Corporation's chess-playing program
Deep Blue, which managed to defeat chess master Garry Kasparov in 1997. It seems that a large
catalog of chess moves was created by analyzing many games of chess masters. During a game,
the program would move pieces based on context, selecting from among the moves that were made
in similar situations by the masters the program was emulating.

This approach is mostly about modeling the choices already figured out by masters and
actually requires little learning or reasoning about music or chess. What is needed is a really big and
really fast database and a sophisticated search capability. And although human composers and
chess players also learn by example, we are not wired to perform by exhaustive search.

Does this disparity in method disqualify EMI or Deep Blue from being called intelligent? Can
a system be intelligent only if its methods are fashioned after our own? Turing urged us not to focus
on the process but on the result. He was less concerned about implicit use of reasoning and learning
than explicit behavior: Does a computer seem intelligent? If so, then it is! Just as we can admire
a beautiful sunset without worrying about how it was created, we should certainly be able to enjoy
a composition that pleases us without caring who, how, or what created it. I suspect Turing would
urge us to quit worrying about intelligence and to relax and enjoy the music.





9.24.4 A Musical Intelligence Test

Cope (1999) comments in the liner notes of his album Virtual Mozart that he avoided analyzing
Mozart's symphonies beyond #31 "because they are so well known that derivations would have
been recognizable." But he goes on to say, "The resultant work does show influences of these later
symphonies, however."

It makes sense that Mozart's later symphonies were informed by his earlier ones. But has Cope's
EMI system managed to do the same thing? Has EMI identified and developed some elements of
Mozart's earlier symphonies into a more mature style? If so, that would seem to be evidence that
Mozart's aural sensibility lies within his music where EMI can access it, and that EMI is able to
find it, extract it, mature it, and use it as the basis of new compositions. Given what we know of
EMI's process, this seems implausible. But according to Turing, we are to consider the system's
behavior, not its inner workings, when deciding about intelligence. And so, like a good jury, let
us follow the judge's orders, at least for now.

Cope's observation that EMI evidently developed some elements of Mozart's mature style is
only anecdotal and subject to interpretation. How could we prove or disprove that EMI (or any
other system) can act to develop a more mature style from a less mature one?

Turing's test methodology allows the experimenter to ask any question or pose any problem that
would help prove or disprove the intelligence of the system under test. But it's hard to ask questions of
a musical score. It is easier to claim that a chess program that beats a master is intelligent because there
is a clear criterion: winning. The arts are more ambiguous.

But music can be analyzed. For example, suppose we conduct a test of intelligence on EMI such
as the following. We begin with two contrasting premises:

■ If a composing automaton is driven by a random process, we should be able to identify the
"wandering" quality in its output that was noted by Pierce (1983).

■ If a composer's works are informed by his or her prior works and the related works of others, then
the same should be true of an artificial composer's works.

This suggests an experiment, as follows. Let $E_1$ stand for the Mozartian symphony created by EMI
for the album Virtual Mozart. Let the way EMI composed $E_1$ be the function $f$ of the set of Mozart
symphonies #6 through #31 as follows: $E_1 = f(M_6, M_7, M_8, \ldots, M_{31})$. Now suppose we
compose a set of $N$ additional Mozartian symphonies using the same EMI technique:

$$
\begin{aligned}
E_2 &= f(M_7, M_8, \ldots, M_{31}, E_1) \\
E_3 &= f(M_8, \ldots, M_{31}, E_1, E_2) \\
&\;\;\vdots \\
E_N &= f(E_1, E_2, \ldots, E_{N-1}).
\end{aligned}
$$

In other words, each new EMI Mozartian symphony replaces the earliest Mozart symphony in the
database until all of Mozart's symphonies are eventually replaced by new EMI Mozartian symphonies.
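The bookkeeping of this proposed experiment is simple enough to sketch in code. In the sketch below, run_experiment and its compose argument are hypothetical names of my own; compose stands in for the function f, that is, for an entire EMI analysis/synthesis run, which is of course not implemented here. The stand-in used in the demonstration merely labels each output so that the changing contents of the database can be inspected.

```python
def run_experiment(corpus, compose, n_new):
    """Generate n_new works; each new work displaces the earliest one in the database.

    corpus  -- the initial list of works (for example, M6 through M31)
    compose -- a callable standing in for f; it receives the current database
    n_new   -- how many new works to generate
    """
    database = list(corpus)
    generated = []
    for _ in range(n_new):
        new_work = compose(tuple(database))
        generated.append(new_work)
        # Drop the earliest work and append the newest one.
        database = database[1:] + [new_work]
    return generated, database


if __name__ == "__main__":
    mozart = [f"M{k}" for k in range(6, 32)]   # 26 symphonies, #6 through #31
    inputs_seen = []

    def stand_in(works):
        # Placeholder for EMI: records its input and returns a label.
        inputs_seen.append(works)
        return f"E{len(inputs_seen)}"

    generated, database = run_experiment(mozart, stand_in, len(mozart))
    print(inputs_seen[1][0], inputs_seen[1][-1])   # first and last inputs to E2
    print(database[0], database[-1])               # database after all replacements
```

After as many generations as there are works in the original corpus, the database contains only generated works, which is the point at which the questions about stylistic drift can be asked.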







As EMI symphonies progressively replace Mozart's in the database used to derive subsequent
symphonies, how does EMI's musical style evolve? Does the progression of symphonies:

■ Present a coherent, self-consistent set of works, as do Mozart's?

■ Develop recognizable signatures of Mozart's mature style?

■ Suggest how Mozart might have developed with his symphonies had he lived longer?

If any answer is yes, that would be strong evidence that EMI has successfully encoded Mozart's
aural sensibility.

Does the progression of symphonies drift off in some other direction that is not Mozartian? Do
subsequent EMI compositions become less musically interesting as the database is progressively
left to its own devices? If either answer is yes, then perhaps EMI has not encoded Mozart's aural
sensibility.

Of course there is also the possibility that EMI remains stylistically stagnant, continuing to
churn out endless minor variations on Mozart's symphonies 6-31.

Cope reportedly conducted an experiment like this based on three works by Igor Stravinsky
(Holmes 1997). From time to time, he mixed in the work of another contemporary of Stravinsky's
to model the way human composers are influenced by their peers. Cope reported that over the course
of time EMI developed the style of a mid-twentieth-century Russian-American composer. This
effort of Cope's seems a lot closer to the true meaning of experiment. There is a hypothesis, a method,
and most important, repeatability. Others could conduct this experiment and the results could be
subject to peer review, all important aspects of the scientific method. It would be particularly interesting
to know if stylistic stagnation resulted if the targeted style were not mixed with others. This might
open up an understanding of the interaction of personal creativity and social forces.

9.24.5 Taste, Goodness, and Design 

If we stick to Turing's behaviorist approach to measuring intelligence, then I think this member
of the jury would have to find EMI guilty of intelligence. But I believe we also have an obligation
to reflect upon the model whereby EMI creates its music and compare that to the human process
as best we can. And when we do, I believe the jury is still out.

Consider the fact that EMI requires a database of preexisting works. When examined from a
functional perspective, its intelligence, like that of every other composing system discussed in
this chapter, is derivative of the music it emulates, derivative of the musical experience of its
operator in fine-tuning its analysis, and derivative of the knowledge of EMI's creator. How
could it be otherwise?

In contrast, Mozart had no corpus of examples of his personal style to draw upon when, as a
young child, he began to compose in his highly recognizable signature style. Of course he was
profoundly influenced by his teachers and by the music around him, but the origination of his
personal style was seemingly guided primarily by his superlative taste, which had no external
referent because, manifestly, no one else ever did manage to compose like he did. The same
could be said of Beethoven, Schoenberg, and Stravinsky, for example. After hearing the child
genius Mozart perform his own compositions, the great composer Joseph Haydn is known to
have remarked to Mozart's father, Leopold Mozart, "Before God, and as an honest man, I tell
you that your son is the greatest composer known to me either in person or by name. He has taste
and, what is more, the most profound knowledge of composition." 22

And of course, Mozart's ultimate signature is the music he went on to create throughout his
career. While the signature of his musical style remained relatively continuous throughout his life,
the compositional visions he brought to life seemed discontinuously to spring full-blown, like
Athena from the head of Zeus. The art of his mature works seemed to have no antecedent. This
characteristic of an artist's work is one of the indicators that predicts enduring fame.

Haydn's taste, Knuth's goodness (see section 9.2.1), and Hiller and Isaacson's aural sensibility
are all key aspects of design. For example, there are an infinite number of ways to find the greatest
common divisor of two integers (including guessing), but we choose Euclid's method (see
section 9.2.2) because its design appeals to us: it shows goodness. Design is a key underpinning of
mathematics as well: "Mathematics are the result of mysterious powers which no one understands
and in which the unconscious recognition of beauty must play an important part. Out of an infinity
of designs a mathematician chooses one pattern for beauty's sake and pulls it down to earth"
(Morse 1959).
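To make the contrast between designs concrete, here is a sketch of two ways to compute a greatest common divisor. The first is the familiar remainder form of Euclid's method; the second is a deliberately plodding search that tries every candidate in turn. Both function names are my own, the search version assumes positive integers, and neither is claimed to reproduce the presentation in section 9.2.2.

```python
def gcd_euclid(a, b):
    """Euclid's method: replace (a, b) by (b, a mod b) until b is zero."""
    while b != 0:
        a, b = b, a % b
    return abs(a)


def gcd_search(a, b):
    """Try every candidate from min(a, b) downward (positive integers only)."""
    for candidate in range(min(a, b), 0, -1):
        if a % candidate == 0 and b % candidate == 0:
            return candidate


if __name__ == "__main__":
    print(gcd_euclid(48, 18), gcd_search(48, 18))
```

Both functions return the same answers for positive integers, but Euclid's method reaches them in a handful of steps where the search may need as many steps as the smaller number. That economy is a large part of what we mean when we say the design shows goodness.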

At bottom, whether designing music or mathematics, we reach into ourselves and extract that
which most agrees with our natures and the problem that we pose to be solved. This is the
fundamental process of art. All else is imitation.

9.25 Calculating Beauty 

Hermann Helmholtz (1863) wrote, 

To furnish a satisfactory foundation for the elementary rules of musical composition ... we tread on new
ground, which is no longer subject to physical laws alone.... Hence it follows, that the system of Scales,
Modes, and Harmonic Tissues does not rest solely upon inalterable natural laws, but is also, at least partly,
the result of aesthetical principles, which have already changed, and will still further change, with the
progressive development of humanity. (250-251; italics in original)

From ancient times, we have sought a rational explanation of nature through scientific enquiry.
Since art was considered an imitation of nature, science also studied art. The Pythagoreans were
perhaps the first to identify a connection between aesthetics and mathematics: beauty was found
to reside in certain mathematical divisions of a vibrating string, and not in others. The
distinguishing characteristic seemed to be the harmonious (beautiful) proportions of the division. Thus the
Pythagoreans discovered what they believed was a way to study beauty objectively, to quantify
beauty by simple integer ratios. Thus beauty could be found in the proportions of a string or in
the proportions of the whole universe. This discovery powered aesthetic research and debate for
centuries.





The Pythagorean observation of beauty in ratios came to be studied under the name eurythmics,
the study of harmony and proportion. As a theoretical device, it could be used to create and analyze
all forms of art: dance, architecture, music, sculpture, painting, and so on. From antiquity and
through the Middle Ages, the sciences of subjective and objective nature thrived together, united
as the branches of the quadrivium. "Mathematical science ... has these divisions: arithmetic,
music, geometry, astronomy. Arithmetic is the discipline of absolute numerable quantity. Music
is the discipline which treats of numbers in their relation to those things which are found in
sound."²³

However, the quadrivium fell apart in the Renaissance. Natural scientists and mathematicians
became increasingly uninterested in the arts because, despite the Pythagorean premise, no theory
had successfully provided a rational link between aesthetics and proportion. The science of
aesthetics fell by the wayside and was deemed unscientific (James 1993). The gulf between the
arts and sciences has continued to this day. Some artists rail against reductionistic explanations of
creativity. Some scientists question whether aesthetic experimentation can be scientific. But
methodological analysis reveals that the disciplines of art and science are cut from the same cloth.
Comparing Euclid's method with Guido's method, we saw that they are distinguished only by the role
of subjective choice, of nondeterminism, in art. Art is not science, but their methods are more
alike than different.

When applied to music, methodological criticism goes quickly to the core of the artist's
intention, allowing us to apprehend the deeper significance of their art. Guido d'Arezzo's method
shows his concern that music should be subordinate to the sacred Latin text. Schoenberg's
combinatoric methods show a desire to deconstruct conventional harmonic expectation. Schillinger's
methods followed from his desire to develop generative theories of art. Xenakis' statistical methods
reflect his view of the quantum nature of sound. Cage's chance methods serve his aim to deconstruct
the expectation of expectation.

Methodological criticism, information theory, psychoacoustics, complexity theory, and other
approaches discussed in this chapter are making important contributions to theoretical aesthetics that
finally allow the dialogue about the nature of art to move beyond its fixation with Pythagorean
proportionality. Perhaps the truest proportions in music are those that relate expectation, interest,
entropy, and redundancy; perhaps the truest study of music structure requires understanding the
nonlinearities of our perceptual and nervous systems as well as the self-organizing principles of nature.




Appendix A


The wondrous potency of music, which moves the world and compels the spirit, captured in the net of 
numbers. 

- Walter Burkert, Lore and Science in Ancient Pythagoreanism

A.1 Exponents

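If p and q are any real numbers, a and b are positive real numbers, and m and n are positive integers, then

$$a^p a^q = a^{p+q}, \qquad \frac{a^p}{a^q} = a^{p-q}, \qquad (a^p)^q = a^{pq}, \qquad (a^m)^{1/n} = a^{m/n},$$

$$a^{-p} = \frac{1}{a^p}, \qquad \left(\frac{a}{b}\right)^{1/n} = \frac{a^{1/n}}{b^{1/n}}, \qquad a^0 = 1 \ \text{if } a \neq 0, \qquad (ab)^p = a^p b^p.$$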
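The laws of exponents can be spot-checked numerically. The following is a minimal sketch in Python (not the book's Musimat); the function name exponent_laws_hold and the sample values are assumptions chosen only for illustration, and the comparison uses a floating-point tolerance rather than exact equality.

```python
import math

def exponent_laws_hold(a, b, p, q, m, n):
    """Check each law of section A.1 to floating-point tolerance."""
    pairs = [
        (a**p * a**q, a**(p + q)),
        (a**p / a**q, a**(p - q)),
        ((a**p)**q, a**(p * q)),
        ((a**m)**(1 / n), a**(m / n)),
        (a**-p, 1 / a**p),
        ((a / b)**(1 / n), a**(1 / n) / b**(1 / n)),
        (a**0, 1.0),
        ((a * b)**p, a**p * b**p),
    ]
    return all(math.isclose(lhs, rhs) for lhs, rhs in pairs)

# Arbitrary sample values (hypothetical): a, b positive; p, q real; m, n positive integers.
print(exponent_laws_hold(a=2.0, b=3.0, p=1.5, q=-0.5, m=3, n=4))
```

A numerical check of this kind illustrates the identities for particular values; it does not prove them.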


A.2 Logarithms

If $a^p = x$, where a is not 0 or 1, then p is called the logarithm of x to the base a. If x, y, and a are
real numbers, then by this definition and the rules for exponents, we can write

$$\log_a xy = \log_a x + \log_a y$$
$$\log_a (x/y) = \log_a x - \log_a y$$
$$\log_a x^y = y \log_a x$$

The irrational number e is called the natural base of the logarithms, and $\log_e x$ is also written as
$\ln x$. When written without specifying a base, log implies base e, that is, $\ln x = \log x = \log_e x$.
The first few digits of e are 2.718281828....

To change the base of a logarithm, use the formula $\log_a x = \log_b x / \log_b a$.
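As an illustration of the change-of-base formula and the three logarithm rules, here is a small Python sketch. The helper name log_base is an assumption, and the sample values are arbitrary. Python's math.log with one argument is the natural logarithm, matching the convention above that an unqualified log means base e.

```python
import math

def log_base(a, x):
    """Logarithm of x to base a, using the change-of-base formula with b = e."""
    return math.log(x) / math.log(a)

x, y, a = 8.0, 32.0, 2.0   # arbitrary sample values
print(log_base(a, x * y), log_base(a, x) + log_base(a, y))   # product rule
print(log_base(a, x / y), log_base(a, x) - log_base(a, y))   # quotient rule
print(log_base(a, x ** y), y * log_base(a, x))               # power rule
print(math.e)                                                # the natural base e
```

Each printed pair should agree up to floating-point rounding.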





A.3 Series and Summations 

A series is any summation of a repeating pattern of terms. An example of an arithmetic series is
2 + 4 + 6 + 8 + .... Each subsequent term is computed by adding or subtracting a constant amount
to the immediately preceding term.

A simple geometric series might be 2 + 4 + 8 + 16 + .... Each subsequent term is computed by
multiplying or dividing the immediately preceding term by a constant amount.

Mathematicians have developed a useful shorthand for representing series, sigma notation. For
example, the equation

$$s = \sum_{n=1}^{4} (5 \cdot n)$$

is shorthand for the equivalent expression

$$s = (5 \cdot 1) + (5 \cdot 2) + (5 \cdot 3) + (5 \cdot 4).$$

We can use it to form the sum of arithmetic and geometric series. The symbol Σ, the Greek
character sigma, is used by mathematicians to represent the sum of a sequence of terms. The expression
to the right of the sigma, (5 · n), is the summand. The numbers below and above the sigma are the
limits of summation, and the variable n is the index.

This example,

$$s(t) = \sum_{n=1}^{4} 5nt, \qquad \text{(A.1)}$$

can be written equivalently as

$$s(t) = (5 \cdot 1t) + (5 \cdot 2t) + (5 \cdot 3t) + (5 \cdot 4t). \qquad \text{(A.2)}$$

This is the expansion of (A.1). Every point t of the function s is described by the entire summation.
The examples above are finite series because the sequences of terms are finite. In the case of an
infinite sequence of terms,

$$x_1, x_2, x_3, \ldots,$$

the corresponding infinite series is

$$\sum_{n=1}^{\infty} x_n = x_1 + x_2 + x_3 + \cdots.$$

The nth term $x_n$ of a series is the general term. An infinite series is convergent if its value tends
toward a finite sum, otherwise it is divergent.
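Sigma notation maps directly onto a loop over the index. The sketch below, in Python, uses hypothetical helper names (sigma, s_of_t, geometric_partial_sum) and assumes the limits n = 1 to 4 implied by the expansion in (A.2). The last loop prints partial sums of the geometric series 1/2 + 1/4 + 1/8 + ..., one example of a convergent series.

```python
def sigma(summand, lower, upper):
    """Sum summand(n) for n = lower, ..., upper (both limits inclusive)."""
    return sum(summand(n) for n in range(lower, upper + 1))

def s_of_t(t):
    """Equation (A.1): the sum of 5*n*t for n = 1..4."""
    return sigma(lambda n: 5 * n * t, 1, 4)

def geometric_partial_sum(ratio, terms):
    """Partial sum ratio + ratio**2 + ... + ratio**terms."""
    return sigma(lambda n: ratio ** n, 1, terms)

print(sigma(lambda n: 5 * n, 1, 4))    # 5 + 10 + 15 + 20
print(s_of_t(2))                       # every term scaled by t = 2
for terms in (5, 10, 50):
    print(terms, geometric_partial_sum(0.5, terms))   # tends toward 1
```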





A.4 About Trigonometry

Trigonometry is the study of trigons, which are triangles, especially right triangles, that are
inscribed within a circle (figure A.1).

The ratio of the circumference of a circle to its diameter is the irrational number π = 3.14....
Because the radius is half the length of the diameter, the circumference is 2π times the radius.

Angles are commonly measured in degrees, and the angle corresponding to a complete rotation
is 360°. There are 2π radians or 360° in a circle (see section 5.2.2). An angle can be measured either
clockwise or counterclockwise from a starting point. Conventionally, positive angles are measured
counterclockwise from the positive horizontal x-axis of the circle, and negative angles are measured
clockwise. Thus, for example, if we picture a circle on a blackboard, an angle of 0° conventionally
points to the right along the positive horizontal axis; 90° points straight up; -90° points straight down;
and 270° = -90°.
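The angle conventions can be sketched in a few lines of Python. The names to_radians and signed_degrees are hypothetical, and the choice of the interval (-180°, 180°] for the signed result is an assumption made here for illustration; the text only requires that 270° and -90° name the same direction.

```python
import math

def to_radians(degrees):
    """Convert degrees to radians using 360 degrees = 2*pi radians."""
    return degrees * 2.0 * math.pi / 360.0

def signed_degrees(degrees):
    """Map any angle to the equivalent angle in (-180, 180]."""
    wrapped = degrees % 360          # Python's % yields a value in [0, 360)
    return wrapped - 360 if wrapped > 180 else wrapped

print(to_radians(360))       # one full rotation, 2*pi radians
print(signed_degrees(270))   # the same direction as -90 degrees
print(signed_degrees(-90))
```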

A.4.1 Sine Relation

Suppose we constructed a triangle like the one in figure A.1 so that the length of c (which is both
the hypotenuse of the triangle and the radius of the circle) is fixed, but sides a and b are elastic:
they can grow and shrink. Also suppose that point p is able to move around the circle, and that
point q is constrained to follow it such that the angle Oqp is always a right angle. Last, the inner
apex of the triangle is always at the center of the circle. These rules basically mean that we are
limited to right triangles inscribed in a circle with the triangle's base resting on the horizontal
axis. As the angle θ increases and point p moves counterclockwise around the circle, the triangle
changes shape in a characteristic way (figure A.2). If we study the way in which the ratio of b/c
changes as the angle θ changes, we observe that this relation corresponds to sinusoidal motion.



Figure A.1

Right triangle inscribed in a circle.






Figure A.2 

Family of triangles inscribed in a circle. 

That is, the radius c, its angle to the horizontal axis θ, and the ratio b/c are connected to each
other by the sine relation:

$$\sin\theta = \frac{b}{c}. \qquad \text{Sine Relation (A.3)}$$

Consider, for example, when θ = 0. Then c lies along the positive horizontal axis, a = c in length,
and b = 0, since the "triangle" has no height. Hence sin 0 = 0/c = 0. When θ = 90°, c lies along
the positive vertical axis, b = c and a = 0. Hence sin 90° = 1/1 = 1. See section 5.4 for more details.

A.4.2 Cosine Relation

Consider the relation between sides a and c in figure A.1. The angle θ and the ratio a/c are
connected to each other by the cosine relation:

$$\cos\theta = \frac{a}{c}, \qquad \text{Cosine Relation (A.4)}$$

which is similar to the sine relation except that its values are shifted by 90°, that is, cos θ = sin(θ + 90°).
This makes sense because the cosine involves the ratio a/c instead of b/c, and side a is orthogonal
(that is, at a 90° angle) to side b; hence it precedes the sine wave by 90°.
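The 90° shift between cosine and sine can be checked numerically. This Python sketch uses a hypothetical helper name, cos_via_shifted_sine, and a handful of arbitrary test angles; Python's math functions take radians, so the degrees are converted first.

```python
import math

def cos_via_shifted_sine(theta_degrees):
    """Evaluate cos(theta) as sin(theta + 90 degrees)."""
    return math.sin(math.radians(theta_degrees + 90.0))

for theta in (0, 30, 45, 60, 90, 180, 270):
    # The two columns should agree up to floating-point rounding.
    print(theta, cos_via_shifted_sine(theta), math.cos(math.radians(theta)))
```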

A.4.3 Tangent Relation

Last, consider the ratio b/a, the ratio of the two elastic sides of the triangle in figure A.1. The angle θ
and the ratio b/a are connected to each other by the tangent relation:

$$\tan\theta = \frac{b}{a}. \qquad \text{Tangent Relation (A.5)}$$

When θ = 45°, the triangle is an isosceles right triangle and a = b, so tan 45° = 1. When θ = 0°,
tan θ = 0/a = 0. But when θ = 90°, tan θ = b/0 = ∞.





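A.4.4 Relating Tangent, Sine, and Cosine

We can relate these definitions to each other as follows:

$$\tan\theta = \frac{b}{a}, \qquad \sin\theta = \frac{b}{c}, \qquad \cos\theta = \frac{a}{c},$$

$$\therefore\ \tan\theta = \frac{\sin\theta}{\cos\theta}. \qquad \text{Relation of Tangent, Sine, and Cosine (A.6)}$$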
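The three ratios can be computed from the sides of a concrete triangle and compared with the library functions. The sketch below is in Python; the function name triangle_ratios and the 3-4-5 triangle are assumptions chosen for illustration. The angle is recovered from the sides with math.atan2, and the hypotenuse with math.hypot.

```python
import math

def triangle_ratios(a, b):
    """Return (theta, b/c, a/c, b/a) for a right triangle with legs a and b."""
    c = math.hypot(a, b)        # hypotenuse, the radius of the circle
    theta = math.atan2(b, a)    # angle of the hypotenuse, in radians
    return theta, b / c, a / c, b / a

theta, sine_ratio, cosine_ratio, tangent_ratio = triangle_ratios(3.0, 4.0)
print(sine_ratio, math.sin(theta))         # b/c versus sin(theta)
print(cosine_ratio, math.cos(theta))       # a/c versus cos(theta)
print(tangent_ratio, math.tan(theta))      # b/a versus tan(theta)
print(sine_ratio / cosine_ratio)           # equation (A.6): sin/cos equals tan
```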

A.4.5 Reciprocal Trigonometric Functions 

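We form the reciprocals for sine, cosine, and tangent by reversing the order of their ratios. Each
of these reciprocals has its own name:

$$\cot\theta = \frac{1}{\tan\theta} = \frac{a}{b}, \qquad \text{Cotangent (A.7)}$$

$$\sec\theta = \frac{1}{\cos\theta} = \frac{c}{a}, \qquad \text{Secant (A.8)}$$

$$\csc\theta = \frac{1}{\sin\theta} = \frac{c}{b}. \qquad \text{Cosecant (A.9)}$$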

A.4.6 Inverse Trigonometric Relations 

The trigonometric functions determine the ratios of the sides a, b, and c from the angle θ of the
hypotenuse. But what if we know the proportions of the triangle and want to use them to find the angle θ?

The inverse trigonometric functions determine the angle of the hypotenuse against the positive
horizontal axis from the ratio of the sides. For instance, the inverse of sin θ is arcsin x, also written
asin x or $\sin^{-1} x$, where x is a ratio of two sides. The cosine and tangent functions are similarly
named, for example, arctan x = atan x = $\tan^{-1} x$.

But how do we define these functions? At first we might think, just inscribe a triangle in a circle
with angle θ, measure its sides, then find their ratio. But there is a problem: because we are measuring
angles on a circle, there are actually many angles (infinitely many, at multiples of 360°) that
correspond to any particular proportion of sides. For example, if x = b/a = 1/1, then the triangle is an
isosceles right triangle and atan x = 45°, but it is also true that atan x = 45° ± (k · 360°), where k is an
integer. So the inverse trigonometric functions are ambiguous.

But, in general, all we usually want is the angle when k = 0. So we define a range of principal values
that covers just these angles. The principal values of the arctangent, arccosine, and arcsine are

$$-90° < \tan^{-1} x < 90°$$

$$0° \le \cos^{-1} x \le 180°$$

$$-90° \le \sin^{-1} x \le 90°.$$
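Library implementations return principal values. As a rough illustration in Python, math.atan, math.acos, and math.asin return angles in the ranges listed above (in radians). The helper names atan_degrees and same_ratio_angles are hypothetical; the second lists a few of the infinitely many angles that share one tangent.

```python
import math

def atan_degrees(x):
    """Principal value of the arctangent, in degrees (between -90 and 90)."""
    return math.degrees(math.atan(x))

def same_ratio_angles(principal_degrees, ks=(-2, -1, 0, 1, 2)):
    """Angles principal + k*360 degrees, which all share one tangent."""
    return [principal_degrees + k * 360.0 for k in ks]

print(atan_degrees(1.0))                       # principal value for b/a = 1/1
for angle in same_ratio_angles(45.0):
    print(angle, math.tan(math.radians(angle)))   # each tangent is close to 1
print(math.degrees(math.acos(-1.0)), math.degrees(math.asin(-1.0)))  # range ends
```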





A.5 Xeno's Paradox

Xeno of Elea reasoned that if one travels distance d from point A to point B, one must certainly
travel half the distance d/2 to point C before traveling the whole distance d. And from that point
one must again travel half the remaining distance d/4, and so on. Continuing in this way, he
reasoned, one would never reach point B because one must pass through an infinite number of
points and that is impossible in a finite time.

Essentially, his argument is that if space and time are composed of indivisible points and
moments, these must have some magnitude, and we are faced with the contradiction of a magnitude
that cannot be divided. If space and time are divisible ad infinitum, we are faced with the
contradiction that an infinite number of points and moments can be added up to make a merely finite sum.
Xeno's point is that since multiplicity and motion contain these contradictions, they cannot be real.
Therefore, as his teacher Parmenides said, there is only one Being, with no multiplicity, excluding
all motion and change.

This is a perfectly fine outcome if you are satisfied with it. If you are not, then a modern way
out of this problem is to consider space and time not as a densely packed infinity of points or
moments, but as sparsely packed such that no point is next to any other point. Thus, between any
two points or moments there is always a third regardless of scale. The advantages of this approach
are twofold. First, the nondenumerable infinity of real numbers (and likewise of points in space
and of events in time) is therefore much larger than the mere denumerable infinity of integers that
Xeno envisioned. Further, the sum of an infinite series of real numbers can have a finite sum. This
latter point is the clincher.
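The finite sum can be watched numerically. This Python sketch (the function name distance_covered is an assumption) adds the successive halves d/2 + d/4 + d/8 + ... and shows the running total closing in on d; a computer can only take finitely many steps, so it illustrates the limit rather than reaching it.

```python
def distance_covered(d, steps):
    """Total distance after repeatedly travelling half of what remains."""
    covered = 0.0
    remaining = d
    for _ in range(steps):
        remaining /= 2.0       # half of the remaining distance
        covered += remaining   # travel that half
    return covered

for steps in (1, 2, 3, 10, 50):
    print(steps, distance_covered(1.0, steps))   # approaches 1.0
```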

A.6 Modulo Arithmetic and Congruence

If it's 1:30, and your friend says she'll meet you in 45 minutes, what time will it be when she
joins you? If you answered 2:15, then you used modulo arithmetic to obtain the answer. Since
there are 60 minutes in the hour, time-based calculations must keep the number of minutes in
that range.

We could formalize the example this way. Using "minute arithmetic," we could write
$75 \equiv ((15))_{60}$, or $75 \equiv 15 \bmod 60$, expressed as "75 is congruent to 15, modulo 60."

In general, if the difference between two integers r and b can be divided without remainder by
another number m, then r and b are congruent modulo m. This is written as

$$r \equiv ((b))_m \quad \text{if } (b - r)/m \text{ is an integer.} \qquad \text{(A.10)}$$

In the example, the quotient of (75 - 15)/60 is an integer, so 75 and 15 are congruent modulo 60.
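The congruence test in (A.10) and the clock example can be sketched in Python as follows. The names congruent and add_minutes are hypothetical, and add_minutes deliberately ignores the wrap from 12 o'clock back to 1 o'clock to keep the example short.

```python
def congruent(r, b, m):
    """True if r and b are congruent modulo m, i.e. (b - r)/m is an integer."""
    return (b - r) % m == 0

def add_minutes(hour, minute, delta):
    """Add delta minutes to a clock time, keeping minutes in the range 0..59."""
    total = minute + delta
    return hour + total // 60, total % 60

print(congruent(75, 15, 60))     # 75 and 15 differ by a multiple of 60
print(add_minutes(1, 30, 45))    # 1:30 plus 45 minutes
```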

A common use of modulo arithmetic is to obtain the remainder of integer division. The value
b is the base and r is the remainder. The FORTRAN programming language provides a way to obtain
the remainder of two numbers with the function mod(b, m); the C and C++ programming
languages define it with a binary infix operator %. Musimat, the programming language invented
for this book (see appendix B), defines the remaindering operation as follows:

Integer Mod(Integer b, Integer m){
    While (b >= m) {b = b - m;}
    While (b <= -m) {b = b + m;}
    Return(b);
}


Note that Mod() can operate on and return negative values. For example, $((-1))_{10} = -1$ and
$((11))_{10} = 1$. In general, the return value will be

$$-m < ((n))_m < m.$$

There are times when it would be convenient to force the remainder r to be a positive modulus
number even if the base b is negative. For example, in Musimat, the index of a List must be
a nonnegative integer. So Musimat has a version of Mod() that returns only the positive wing of
modulo values:

Integer PosMod(Integer b, Integer m){
    While (b >= m) {b = b - m;}
    While (b < 0) {b = b + m;}
    Return(b);
}


For example, Print(Mod(-13, 10)); prints -3, whereas Print(PosMod(-13, 10)); prints 7
(see figure A.3).

Both Mod() and PosMod() preserve the position within the modulus interval, but PosMod()
also requires the value to be positive. If we have a List of ten elements numbered 0 to 9, we can
provide a base of any positive or negative value b to PosMod(b, 10) and it will coerce the remainder
to lie within the valid range of the List.
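For readers who want to run the two functions, here is a line-for-line port to Python. It is a sketch, not part of Musimat; the lowercase names mod and pos_mod are assumptions, and m is assumed to be positive. The last line compares pos_mod with Python's own % operator, which for a positive modulus also returns a nonnegative remainder.

```python
def mod(b, m):
    """Port of Mod(): remainder with the same sign as b, for positive m."""
    while b >= m:
        b = b - m
    while b <= -m:
        b = b + m
    return b

def pos_mod(b, m):
    """Port of PosMod(): remainder forced into the range 0..m-1, for positive m."""
    while b >= m:
        b = b - m
    while b < 0:
        b = b + m
    return b

items = list("abcdefghij")            # a list of ten elements, indexed 0 to 9
print(mod(-13, 10), pos_mod(-13, 10))
print(items[pos_mod(-13, 10)])        # any base b lands on a valid index
print(pos_mod(-13, 10) == -13 % 10)   # Python's % agrees for positive m
```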



Figure A.3 

Signed and unsigned modulus operation. 





A.7 Whence 0.161 in Sabine's Equation?


Wallace Sabine derived the constant 0.161 in his equation for reverberation time in a room and
verified it both experimentally and theoretically (see equation 7.31). His derivation provides a
fundamental view of statistical room acoustics.

In physics the mean free path is the average distance a particle can move in a gas without a
collision. Sabine (1921) adapted the term to mean the average distance a wave front can travel in a
room before being reflected by a wall. He demonstrated that an approximate value for the acoustic
mean free path is

$$\mathrm{MFP} = \frac{4V}{S} = d, \qquad \text{Acoustic Mean Free Path (A.11)}$$

where V is volume and S is surface area of the room. This approximation assumes that the sound
field in the room is diffuse, that is, the energy density is uniformly distributed throughout the room
(homogeneous) and is traveling with equal intensity in every direction (isotropic). This ideal
condition can only be approximated in real rooms because different areas of a hall will have different
amounts of absorption, depending upon the hall's geometry and absorptive properties.

The average number of reflections per second $f_R$ is the speed of sound c divided by the MFP:

$$f_R = \frac{c}{d} = \frac{Sc}{4V}.$$

The average time between reflections is the inverse:

$$\Delta t = \frac{d}{c} = \frac{4V}{Sc}.$$

Although air absorbs some sound, much more sound is absorbed by walls during reflections. So
a room with smaller Δt will have a shorter reverberation time than a room with greater Δt.

Intuitively, when the sound source is first turned on in a room, it pumps acoustic energy into
the room, and though the walls suck energy out, they don't remove it all. The energy remaining
in the room increases its total energy, and we hear an exponential buildup of sound level over time.
When the rate of energy entering the room equals the rate of energy leaving the room, equilibrium
is achieved, and the energy density plateaus.

Energy density in a reverberant room can be likened to the mean water level in a leaky water
tank: the level in the tank is proportional to the input rate of flow and inversely proportional to the
output rate of flow. If the input rate of flow goes to zero, water will drain out and the rate of water
loss will be proportional to the remaining water level. Similarly, in a room, the rate of energy loss
is proportional to the remaining energy density.

Let the energy density in a room at time t be denoted by the function W(t). (Calculus alert!) By
definition, the energy rate of change is dW/dt. We can express the observation that the rate of energy
change is proportional to the remaining energy by writing

$$\frac{dW}{dt} = -kW, \qquad \text{(A.12)}$$

where k is a constant of proportionality that can be determined for particular rooms. The constant k
specifies the steepness of the slope. For example, a highly absorptive room such as an anechoic
chamber would have a relatively large value of k, whereas a room with reflective stone walls would
have a small value because its reverberation decays much more slowly. Equation (A.12) is a
first-order differential equation. The requisite mathematical equipment to solve it is presented in
volume 2, chapter 6.

Let's assume a trial solution of the form

$$W(t) = e^{-kt}. \qquad \text{(A.13)}$$

If we substitute this definition of the function W back into (A.12), we get (by the chain rule) the
general solution

$$\frac{dW}{dt} = -kW = \frac{d\,e^{-kt}}{dt} = -k e^{-kt}. \qquad \text{(A.14)}$$

When we switch off the input power to measure the reverberation time, let's say the total energy
density in the room is initially $W(0) = W_0$. Also set k = 1/τ, where τ is the time constant of the
exponential curve. Then we can write the particular solution as

$$W(t) = W_0 e^{-t/\tau}, \qquad \text{(A.15)}$$

which says that the power W in the room is a decaying exponential function of time t with time
constant τ.

The reverberation time $T_R$ corresponds to the length of time it takes for the sound to become
inaudible, defined as the time required for the sound to decay by 60 dB SIL, that is, to a millionth
of its original intensity. We want to know the time $T_R$ required for energy to drop by a factor of $10^{-6}$
in a particular room, that is, we want a solution to the equation

$$10^{-6} W_0 = W_0 e^{-(T_R/\tau)}.$$

Solving for $T_R$, we obtain

$$T_R = \log(10^6)\,\tau, \quad \text{or} \quad T_R = 13.8\,\tau.$$
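The factor 13.8 is the natural logarithm of one million. A short Python sketch makes the step concrete; the function names energy and reverberation_time and the time constant of 0.1 s are assumptions for illustration only.

```python
import math

def energy(t, w0, tau):
    """Equation (A.15): exponentially decaying energy density."""
    return w0 * math.exp(-t / tau)

def reverberation_time(tau):
    """Time for the energy to fall to one millionth: T_R = ln(10**6) * tau."""
    return math.log(1e6) * tau

tau = 0.1                                     # hypothetical time constant, in seconds
t_r = reverberation_time(tau)
print(math.log(1e6))                          # the factor, about 13.8
print(t_r, energy(t_r, 1.0, tau))             # energy is about 1e-6 of its start
print(10 * math.log10(energy(t_r, 1.0, tau))) # about -60 dB
```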


Recall the definition of the average time between reflections Δt from (A.11). Setting τ = Δt, we
obtain


Using a value of c = 342 m/s for the speed of sound, and combining constants, we have Sabine's
equation:

$$T_R = 0.161\,\frac{V}{S\bar{a}}. \qquad \text{(A.16)}$$


In the literature, the constant ranges from 0.160 to 0.164, depending upon the speed of sound used.
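The dependence of the constant on the speed of sound can be sketched as follows, assuming the constant is formed as 4 × 13.8 / c, which is how the factor 13.8, the 4 in the mean free path, and c = 342 m/s combine. The function name sabine_constant and the alternative speeds are assumptions for illustration; using the unrounded ln(10⁶) instead of 13.8 gives a slightly larger value.

```python
import math

def sabine_constant(c, decay_factor=13.8):
    """Constant in Sabine's equation, assumed here to be 4 * decay_factor / c."""
    return 4.0 * decay_factor / c

for c in (337.0, 342.0, 345.0):                 # speeds of sound in m/s
    print(c, sabine_constant(c))
print(sabine_constant(342.0, math.log(1e6)))    # with the unrounded factor
```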

So is this the right constant? Is this the right equation for the job? Given the number of alternative
formulations in the literature over the last hundred years, it would seem that Sabine's remarkable
achievement is not without flaws. Even if we correct for air absorption, his formula tends to
estimate reverberation times that vary widely from actual results. Also, its statistical nature provides
no way to adjust it for environments that are not ideally diffuse.

Another problem is that as average absorption $\bar{a}$ approaches 1, Sabine's equation does not
predict that reverberation time goes to 0, even though in an anechoic chamber it effectively does.
Eyring (1933) proposed a reverberation formula in which the absorption coefficient is calculated
according to $a_E = -\ln(1 - \bar{a})$, so that (A.16) becomes

$$T_R = 0.161\,\frac{V}{-S \ln(1 - \bar{a})}, \qquad \text{(A.17)}$$

which gives a reverberation time of 0 when $\bar{a} = 1$, and reduces to Sabine's formula when $\bar{a} \ll 1$.
Many other refinements and alternatives are now available. After a century, reverberation time is
still the subject of active research.
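The two formulas can be compared side by side. This Python sketch assumes (A.16) has the form 0.161 V / (S ā), consistent with (A.17); the function names, the room dimensions, and the absorption values are hypothetical. At ā = 1 the logarithm is undefined, so the sketch returns the limiting value 0 explicitly.

```python
import math

def sabine_rt(volume, surface, absorption, k=0.161):
    """Sabine reverberation time, assuming the form k * V / (S * a)."""
    return k * volume / (surface * absorption)

def eyring_rt(volume, surface, absorption, k=0.161):
    """Eyring reverberation time, equation (A.17)."""
    if absorption >= 1.0:
        return 0.0             # limiting value: a fully absorptive room
    return k * volume / (-surface * math.log(1.0 - absorption))

volume, surface = 1000.0, 700.0          # hypothetical room, m^3 and m^2
for a in (0.01, 0.1, 0.5, 0.9, 1.0):
    print(a, sabine_rt(volume, surface, a), eyring_rt(volume, surface, a))
```

For small absorption the two columns nearly coincide; as absorption approaches 1 the Eyring value falls to 0 while the Sabine value does not.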


A.8 Excerpts from Pope John XXII's Bull Regarding Church Music

The competent authority of the Fathers has decreed that, in singing the offices of divine praise through which
we express the homage due to God, we must be careful to avoid doing violence to the words, but must sing
with modesty and gravity, melodies of a calm and peaceful character.... But certain exponents of a new
school, who think only of the laws of measured time, are composing new melodies of their own creation with
a new system of notes, and these they prefer to the ancient, traditional music.... By some, their melodies
are broken up by hocheti or robbed of their virility by discanti, tripla, motectus, with a dangerous element
produced by certain parts sung on texts in the vernacular.... The mere number of the notes, in these compositions,
conceal from us the plainchant melody, with its simple, well-regulated rises and falls which indicate the
character of the Mode. These musicians ... intoxicate the ear without satisfying it, they dramatize the text
with gestures and, instead of promoting devotion, they prevent it by creating a sensuous and innocent
atmosphere.... We are prepared to take effective action to prohibit, cast out, and banish such things from the
Church of God. Therefore, ... We prohibit absolutely, for the future that anyone should do such things, or
others of like nature, during the Divine Office or during the Holy Sacrifice of the Mass.... However, We do
not intend to forbid the occasional use ... of certain consonant intervals superposed upon the simple
ecclesiastical chant, provided these harmonies are in the spirit and character of the melodies themselves, as, for
instance, the consonance of the octave, the fifth, the fourth, and others of this nature; but always on condition
that the melodies themselves remain intact in the pure integrity of their form, and that no innovation take place
against true musical discipline.... Made and promulgated at Avignon in the Ninth Year of Our Pontificate
(1324-1325). Corpus juris canonici. (Hayburn 1979)





A.9 Greek Alphabet

Besides being the alphabet of a major modern civilization, the Greek alphabet (table A.1) is useful
not only for the study of mathematics but also for students being rushed for fraternities. It may also
come in handy when eating alphabet soup in Greece.


Table A.1

Greek Alphabet

Alpha    Α  α      Iota     Ι  ι      Rho      Ρ  ρ
Beta     Β  β      Kappa    Κ  κ      Sigma    Σ  σ
Gamma    Γ  γ      Lambda   Λ  λ      Tau      Τ  τ
Delta    Δ  δ      Mu       Μ  μ      Upsilon  Υ  υ
Epsilon  Ε  ε      Nu       Ν  ν      Phi      Φ  φ
Zeta     Ζ  ζ      Xi       Ξ  ξ      Chi      Χ  χ
Eta      Η  η      Omicron  Ο  ο      Psi      Ψ  ψ
Theta    Θ  θ      Pi       Π  π      Omega    Ω  ω




Appendix B


Mathematics is the music of reason.

- James J. Sylvester

B.1 Musimat

Why did I invent a new programming language when there are so many excellent ones already
available? The problem is that most programming languages are more general-purpose than is
required for the relatively specialized purposes of this material, and a proper introduction to such
a general-purpose language would lead the discussion too far afield. I decided it would be of more
service to readers to specialize the language so that its features would match the examples in this
book as closely as possible. That way, the focus would remain on the subject being coded rather
than on the language being used to code it.

Nevertheless, Musimat is similar to other procedural languages such as C or C++ (see Stroustrup
1991), so if you already know one of these, it should be easy to pick up Musimat. If you don't
know any programming language, learning one should be easier after you learn Musimat.¹
I present everything you'll need to know about Musimat in the following sections.

B.1.1 Basic Elements

Virtually all programming languages, including Musimat, share the following characteristics:

■ Flow control Specifying the order in which the steps are to be taken.

■ Data types Naming the kinds of objects to be operated on and describing their behaviors. Types
of numbers, such as integer and real, are common basic data types.

■ Variables Names of places to hold data of various types.

■ Operators A set of actions that can be performed on data. Operations like "add", "assign", and
"select" perform well-defined operations on the data.

■ Conditional evaluation Making decisions based on circumstances and taking appropriate
action.

■ Iteration If an algorithm is to be applied repeatedly to data, for instance, the way Euclid's
method does, then we need a way to express this.

■ Recursion If a future output depends upon a current or previous output as well as possibly the
current inputs, we say that the relationship is recursive.

■ Data structures It is sometimes necessary to group data into collections, such as sets, lists,
arrays, and matrices. The types of these data structures can be homogeneous (all alike) or
heterogeneous (a mixed bag).

■ Named methods When we've developed a set of instructions that does something useful, we
want to be able to give it a name, like "Euclid's method" or "Guido's method." Since programming
languages developed out of the mathematics of functions, we use functional notation to represent
the operation of methods.

B.1.2 Statements and Expressions

Most methods read, "Do this, then do that." Each "do this" step is a statement. Sequences of
statements are read left to right, then down the page. The elements of each statement, called expressions,
determine what the statement is about. In many programming languages (including Musimat),
semicolons (;) separate statements.

B.1.3 Data Types 

To begin, we need only two types of numbers, Integer, which is a positive or negative whole
number, and Real, which is an approximate real number. To keep things simple, let's assume for
our purposes that we have virtually unlimited precision for computations. As we go along, I
introduce additional data types as needed.

B.1.4 Constants 

A constant is any number whose value does not change. The number 3 is a constant Integer. The
number 3.14159 ... is a constant Real.

Musimat also predefines two constants, True and False, and gives them integer numeric
values of 1 and 0, respectively.

B.1.5 Variables 

Variables are named places to store data. Names are indicated by one or more upper- or lower-case
letters, like q, n, or fred. Alphabetic case is significant, so fred denotes a different variable than
does Fred. Numbers can also be used in variable names (for example, Fred33), but the first letter
of a variable name may not be a number.

Since they physically embody data, variables occupy space and time. Variables come into
existence when they are defined, and generally hold their value until the end of the program
unless additional steps are taken to change their value or to restrict their existence to a certain
region of the program.

B.1.6 Reserved Words 

For the language to be unambiguous, we must reserve the meaning of certain words and symbols.
Reserved words are distinguished by an initial capital letter and are shown using a special
font. Reserved words include If, While, Do, For, Repeat, Else, Halt, Real, Integer,
Return. Some other symbols are also reserved. These symbols can't be used for anything but their
designated meaning.

B.1.7 Lists 

We can group sets of variables to keep track of their relations. An IntegerList represents a
collection of Integer expressions, for example,

    IntegerList iL = {1, 1 + 1, 3, 5 - 1};

defines a list iL containing the integers 1 through 4.

A RealList represents a collection of Real expressions:

    RealList rL = {1.1, 2.2, 3.3, 4.4};

We can obtain the length of a list of any type. For example,

    Integer n = Length(rL);
    Print(n);

prints 4.

B.1.8 Operators and Operands

The symbols + and - are operators, and the data they act upon are operands. Most operators take
two operands, and the operator lies between the operands, for example, a + b, and c / d. Such
operators are called binary infix, meaning that the operator lies between two operands. In its binary
infix form, the symbol - means subtraction, for example, a - b. The unary prefix - operator
comes before the expression it negates, for example, -3.

Multiplication in mathematics is typically expressed by the concatenation of variables, so for
instance at means the product of variables a and t. But this can be ambiguous, because at could
also refer to the single word "at". To avoid ambiguity, the infix operator * indicates multiplication,
so the product of m times n is written m * n.

B.1.9 Assignment 

We can assign the value of an expression to a variable using the assignment operator =. For example,

    lhs = rhs;

assigns the expression rhs to lhs. The object on the right-hand side of the = sign (i.e., rhs) can
be any expression. The object on the left-hand side (i.e., lhs) must be a variable name, with one
exception. For example, the statement

    s = 3 + 5;

sets the value of variable s to 8.

The left-hand side of an assignment can also indicate that a certain element of a list is to receive
the value on the right-hand side. For example,

    IntegerList iL = {0, 1, 2, 3};
    iL[2] = 10;

replaces the third element with 10, causing the list to become

    {0, 1, 10, 3}

Note that lists are indexed starting at 0. So writing

    iL[0] = 55;

causes the list to be

    {55, 1, 10, 3}
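The same list assignments behave identically in a language with zero-based lists. A minimal Python rendering, for comparison only (the function name replace_elements is hypothetical, and Python lists are not typed the way a Musimat IntegerList is):

```python
def replace_elements():
    """Mirror the Musimat example: assign to list elements by zero-based index."""
    iL = [0, 1, 2, 3]
    iL[2] = 10      # index 2 is the third element
    iL[0] = 55      # index 0 is the first element
    return iL

print(replace_elements())
print(len(replace_elements()))   # the counterpart of Length()
```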

B.1.10 Relations 

Relational operators compare numeric values. For example, in the expression x < y, if y is greater
than x, the value of the expression is True; otherwise the value of the expression is False. Other
relational operations include > for greater than, <= for less than or equal, and >= for greater than
or equal.

Because = has already been given the meaning of assignment, we must choose another way to
express equality, which we do by putting two equals signs together: ==. For example, the
expression x == y is True if x and y have the same value.

Inequality is expressed by !=, so for example, x != y is True if x and y have different values.

B.1.11 Logical Operations

Logical operators compare truth values. For example, the expression x And y is true if and only
if x == True and y == True. The expression x Or y is true if either x == True or
y == True.

B.1.12 Operator Precedence and Associativity

In the expression a * x + b * y, what is the order in which the operations are carried out? By
the standard rules of mathematics, we should first form the products a * x and b * y, then
sum the result. So multiplication has higher precedence than addition. The natural precedence of
operations can be overridden by the use of parentheses. For example, a * (x + b) * y forces the
summation to occur before the multiplications.

In the expression a + b + c, we first add a to b, then add the result to c, so the associativity 
of addition is left to right. We could express left-to-right associativity explicitly like this: 

(((a) + b) + c).

The rules of precedence and associativity in programming languages can be complicated, but
the programming examples in this section use the following simplified rules.

Expressions are evaluated from left to right, except

■ Multiplications and divisions are performed before additions and subtractions.

■ All arithmetic operations (+, -, *, /) are performed before logical operations (And, Or).

■ Parentheses override the above precedence rules.

For details, see section B.3.
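The simplified rules can be checked by hand with small numbers. The following is a sketch in Python rather than Musimat (Python applies the same precedence to these particular operators); the variable names and values are chosen here purely for illustration and do not come from the text.

```python
# Python sketch of the simplified precedence rules; this is not Musimat code.
a, x, b, y = 2, 3, 4, 5

# Multiplications are performed before the addition: (a * x) + (b * y).
products_first = a * x + b * y

# Parentheses override precedence: the sum x + b is formed first.
grouped = a * (x + b) * y

# Addition associates left to right: (a + b) + y.
left_to_right = (a + b) + y

# Arithmetic is evaluated before the logical operator `and`
# (Python's spelling of And); the comparisons supply the truth values.
logical = (a + x > b) and (b * y > a)

print(products_first, grouped, left_to_right, logical)
```

Writing the grouping out explicitly, as in left_to_right, costs nothing and removes any doubt about the order of evaluation.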

B.1.13 Type Promotion and Type Coercion

What if the values in an expression are not of the same type? For example, since both operands in
the expression 2/3 are integers, the quotient will be an integer. The quotient of 4.5/2.25 will
be a real number because both operands are reals. But what is the quotient of 2/2.25? Our options
are to coerce the numerator to be a Real and then perform real division, or coerce the denominator
to be an integer and then perform integer division. Which shall it be?

Since the set of all reals includes the set of all integers, it makes sense to promote the integer 2
to the corresponding real value 2.0 and then perform real division. Musimat automatically
converts 2/2.25 into 2.0/2.25 and then performs real division. In general, integer values are
automatically promoted to reals wherever they occur in an expression with reals.

If automatic type promotion is not desired, the type of an expression can be coerced by directly
indicating its type. Consider the expression:

10/Integer(3.33)

First, the real value 3.33 is truncated to the integer value 3, then because both numerator and
denominator are now integers, integer division is performed. Beware of things being done for you
automatically by computers! You still must pay attention to head off unintended consequences.
Consider:

26/Integer(2.5)

equals 13, but

Integer(26/2.5)

equals 10.
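The difference between coercing before and after the division can be reproduced in most languages. The sketch below uses Python as a stand-in for Musimat, under the assumption that Python's int() plays the role of Integer() (both truncate) and that // stands for integer division; the variable names are illustrative only.

```python
# Python sketch; `/` is real division and `//` is integer division here.
promoted = 2 / 2.25              # the integer 2 is promoted to 2.0 first

coerce_first = 26 // int(2.5)    # int(2.5) truncates to 2, then 26 // 2
coerce_last = int(26 / 2.5)      # real division gives about 10.4, truncated afterwards
ten_by_three = 10 // int(3.33)   # int(3.33) truncates to 3, then 10 // 3

print(promoted, coerce_first, coerce_last, ten_by_three)
```

The two coercions differ only in where the truncation happens, yet they produce different answers, which is the hazard the text warns about.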





B.1.14 Accessing List Elements

We can access an element of a list using the index operator []. Suppose we have the following
declarations:

Integer w = 1, x = 2, y = 3, z = 4;

IntegerList iL = {w, x, y, z};

Then the statement

Integer c = iL[0];

assigns c the same value as w (which is 1). The statement

c = iL[3];

assigns c the same value as z (which is 4).

The first element on a List is indexed by 0, and if a List has N elements, the last one is indexed
by N - 1.

B.1.15 Functions 

Functional notation in mathematics allows us to encapsulate and name arithmetic expressions. For
example, if we have defined the function f(a, b) = a + b, then f stands for a + b. The value or values
in parentheses after a function name, called arguments, supply the function with inputs. Functions
also typically return a result. For example, using this definition of f, 7 = f(3, 4).

Programming languages typically come with a set of predefined functions for the most common
necessities, and they also allow new functions to be created. For example, in Musimat all
operators also have a functional representation, so writing

Real x = Divide(11.0, 4.0);

is the same as writing

Real x = 11.0/4.0;

In this case, x is set to the quotient, 2.75. Real division is performed because both numerator
and denominator are reals. If we want to perform integer division, both numerator and denominator
must be integers. We could write

Integer x = Divide(11, 4);

or equivalently,

Integer x = 11 / 4;





In either case, x is set to the quotient, 2, and the remainder is discarded. To get the remainder after
integer division we can write

Integer x = Mod(11, 4); // remainder of integer division

The variable x is set to 3, the remainder of 11/4. For positive integers m and n, Mod(m, n)
lies between 0 and n - 1. Why is this function called Mod instead of, say, Remainder? See
appendix A, section A.6. The equivalent operator form for remaindering also looks a little strange:

Integer x = 11 % 4;

The % sign does not have its usual meaning of "percent" in Musimat. Instead, it means "remainder
of integer division." Mod and % can only be applied to integer operands.

Some useful built-in functions are not associated with operators. Exponentiation is performed
by the function Pow(). These three statements,

Real base = 10.0;

Real exp = 2.0;

Real x = Pow(base, exp);

are equivalent to writing $x = 10.0^{2.0}$, and the result stored in x is 100.0. Going the other way,
Log10(x) is equivalent to $\log_{10} x$, and

Real y = Log10(100.0);

sets y to 2.
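For comparison, the same operations can be tried in a general-purpose language. This is a Python sketch, not Musimat: it assumes that Python's //, %, math.pow, and math.log10 are reasonable stand-ins for Divide on integers, Mod, Pow, and Log10, and the variable names are invented for illustration.

```python
import math

real_quotient = 11.0 / 4.0        # real division, like Divide(11.0, 4.0)
int_quotient = 11 // 4            # integer division, remainder discarded
remainder = 11 % 4                # remainder of integer division, like Mod(11, 4)

power = math.pow(10.0, 2.0)       # 10.0 raised to the power 2.0
exponent = math.log10(power)      # base-10 logarithm undoes the exponentiation

print(real_quotient, int_quotient, remainder, power, exponent)
```

The quotient and remainder always satisfy 4 * int_quotient + remainder == 11, which is a handy check when the two are computed separately.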

Another built-in function, Print(), allows us to observe the value of a variable or expression.
When executed, the statements

Real x = 11.0/4.0;

Print(x);

display the value of x, or 2.75. The way in which the value is displayed varies with the type of the
expression and the type of computer. If the computer is a person, for example, he or she might say "two
point seven five." If it is an electromechanical computer, it might show the value on a display screen.²

When the predefined function Halt() is executed, the method in progress stops at that step in
the program. The argument to Halt(), if any, can be used to indicate the answer or result obtained
by the program up to that point.

One final built-in function is Random(), which returns a real number in the range of 0.0 to 1.0
chosen at random.

B.1.16 Conditional Statements 

A mathematical notation for determining the sign of a number is

$$y = \begin{cases} a, & x < 0, \\ b, & \text{otherwise}, \end{cases}$$

which sets y to a if x is negative; otherwise it sets y to b. Such relational expressions are called
predicates. Musimat accomplishes the same thing like this:

If (x < 0)
    y = a;
Else
    y = b;

In this example, y receives the value of a if x is less than 0; otherwise y receives the value of b.
The Else part of this construction is optional. So for example,

If (a < b)
    Print(a);

prints a only if it is less than b. If and Else can be combined to allow chains of predicates:

If (x < 0)          // is x negative?
    y = a;
Else If (x == 0)    // it's not negative, but is it zero?
    y = b;
Else                // neither negative nor zero, x must be positive
    y = c;

B.1.17 Compound Statements 

Suppose we need to do more than one thing depending on the value of a predicate. If we need to execute
multiple statements that depend upon a common predicate, we can group them together into a list of
statements. For example, {m = n; n = r;} is a list of statements, also called a compound statement.
Consider steps 2 and 3 of Euclid's method (see section 9.2.2), which can be expressed

If (r == 0)
    Halt(n);
Else {
    m = n;
    n = r;
}

If r is not equal to 0, first m is assigned the value of n, and then n is assigned the value of r. We
express this in Musimat by making these two steps into a compound statement.

Any legal statement can appear within a compound statement, including other compound statements.
This means we can nest compound statements inside each other.

B.1.18 Iteration 

We must be able to repeat a statement or statements multiple times. For example, Euclid's method
returns to step 1 from step 3, depending upon the value of variable r (see section 9.2.2). In
Musimat, the Repeat statement causes a statement or compound statement to repeat interminably.
This allows us to implement Euclid's method as follows:

Repeat {
    r = Mod(m, n);     // remainder of m divided by n
    If (r == 0) {
        Halt(n);       // halt, and give answer n
    } Else {
        m = n;
        n = r;
    }
}

This code shows an example of nested compound statement lists. The bare syntax of this example is

Repeat {... If (...) {...} Else {...}}

and the compound statements following If and Else are nested inside the compound statement
following Repeat. We can nest compound statements as deeply as we desire.

Since it never stops by itself, the only way to terminate a Repeat statement is with a Halt
statement.³ It's a crude but effective technique; however, there are more elegant ways to decide how
many times to repeat a block of statements. The Do-While statement allows us to specify a
termination condition that is evaluated after the body has been executed. Here is an example that
prints the random value assigned to x and repeats for as long as x is less than 0.9.

Real x;

Do {
    x = Random();  // choose a random value between 0.0 and 1.0
    Print(x);
} While (x < 0.9);

Because Random() returns a uniform random value in the range 0.0 to 1.0, its value will be less
than 0.9 on average 90 percent of the time. It is possible, though unlikely, that this statement would
print its value only once, and it is also possible that it could print dozens, even hundreds, of times
before halting, depending upon the particular sequence of random numbers returned by Random().

The For statement also implements a way of repeating a statement or compound statement a
number of times, but it allows us to directly manage the value of one or more variables each time
the statements are executed and to use them to determine when to stop. This example prints the
integers from 0 through 9:

Integer i;

For (i = 0; i < 10; i = i + 1)
    Print(i);

The variable i is called the control variable. The example first sets i to 0, then tests if i < 10.
Since 0 < 10, the Print() statement is executed. Next, the For statement executes the
statement i = i + 1, which adds 1 to the value of i. So now i equals 1. Again, the For loop tests
if i < 10, and since 1 < 10, it executes Print() again. It again adds 1 to the value of i. So now
i equals 2. This process continues until i == 10, whereupon the For loop terminates because
then i < 10 is False.

The For statement is a little twisty, so let's take a more careful look at its operation. In general,
we can name the parts of the For statement as follows:

For (initialization; test; change)
    statement

where statement can be a single statement (terminated by a semicolon) or a compound statement
(enclosed with curly braces). The For statement first executes the initialization code, then
evaluates the boolean expression test. If the value of test is False, the For statement terminates.
If the value of test is True, the statement is executed, then the change expression is executed,
and finally the test is evaluated again. If the value of test is False, the For statement terminates.
If the value of test is True, the cycle repeats again and again until the value of test is False.

As a convenience, it is possible to define and set the value of the initialization variable
in one step, so the preceding example could have been written

For (Integer i = 0; i < 10; i = i + 1)
    Print(i);
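The order in which the three parts of a For statement run is easiest to see when the loop is unrolled into a plain test-at-the-top loop. The following is a Python sketch rather than Musimat; the function name for_loop_trace is hypothetical, and the list printed stands in for the values that Print(i) would display.

```python
def for_loop_trace(limit):
    """Emulate For (i = 0; i < limit; i = i + 1) Print(i); with a while loop."""
    printed = []
    i = 0                  # initialization: executed exactly once
    while i < limit:       # test: evaluated before every pass
        printed.append(i)  # statement: stands in for Print(i)
        i = i + 1          # change: executed after the statement
    return printed, i      # i is returned to show its value when the test fails

values, final_i = for_loop_trace(10)
print(values, final_i)
```

Note that the control variable ends at 10, not 9: the loop stops because the test fails, after the last change has already been applied.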

B.1.19 User-Defined Functions 

Musimat, like most programming languages, allows users to define their own functions. Take
Euclid's method, for example. To define it, we must state how the input variables m and n receive
their inputs, and determine what happens to the result when the method halts. We can define a
function named euclid() in Musimat as follows:

Integer euclid(Integer m, Integer n) {
    Repeat {
        Integer r = Mod(m, n);
        If (r == 0)
            Return(n);
        Else {
            m = n;
            n = r;
        }
    }
}

The function is declared to be of type Integer because it will return an integer result.





Note that Return(n) has been substituted for the Halt(n) function shown previously. Instead
of halting execution altogether, the Return(n) statement only exits the current function, carrying
with it the value of its argument back to the context that invoked it. The program can then continue
executing from there, if there are statements following its invocation. Here's an example of
invoking the euclid() function:

Integer x = euclid(91, 416);

Print(x);

which will print 13. If we had used Halt() in euclid(), we'd never reach the Print statement
because the computer would stop.

Here's another way to compute the same thing:

Print(euclid(91, 416));

This way we can eliminate the "middleman" variable x, which only existed to carry the value from
the euclid() function to the Print() function. In this example, the call to the euclid()
function is nested within the Print() function. Musimat invokes the nested function first, and the
value that euclid() returns is supplied automatically as an argument to the enclosing function,
Print(). Functions can be nested to an arbitrary extent. The most deeply nested function is
always called first.
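For readers who want to run Euclid's method, here is a sketch of the same function in Python rather than Musimat. It assumes positive integer inputs, uses an endless while loop in place of Repeat, and uses Python's % operator in place of Mod.

```python
def euclid(m, n):
    """Greatest common divisor of two positive integers by repeated remaindering."""
    while True:            # plays the role of Repeat
        r = m % n          # remainder of m divided by n
        if r == 0:
            return n       # exits only this function, like Return(n)
        m, n = n, r        # m takes the value of n, then n takes the value of r

x = euclid(91, 416)
print(x)
print(euclid(91, 416))     # nested call: euclid runs first, print receives its result
```

Because return only leaves the function, both print calls are reached; a statement that stopped the whole program at that point would prevent either from running.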

B.1.20 Invoking Functions

We had two situations where the function euclid() was followed by a list of arguments, once
where it was defined, and another where it was invoked. The arguments associated with the
definition of euclid() are called its formal arguments. They are Integer m and Integer n.
The values associated with its invocation (integers 91 and 416) are called its actual arguments.
A function will have only one set of formal arguments that appear where the function is defined.
It will have as many sets of actual arguments as there are invocations of the function in a program.
When a function is invoked, three things happen:

1. The values of the actual arguments are copied to the corresponding formal arguments.

If an actual argument is a constant, its value is simply copied to the corresponding formal argument.
Example: Print(3) copies 3 to the formal argument for Print().

If an actual argument is a variable, its value is copied to the corresponding formal argument. Example:
Integer a = 3; Print(a) copies the value of a (which is 3) to the formal argument for
Print().

If an actual argument is another function, that function is evaluated first, and its return value
replaces the function. Example: For the statement Print(euclid(91, 416)), first
euclid(91, 416) is evaluated, and the result (which is 13) is substituted in its place. So the
statement becomes Print(13). Finally, the 13 is copied to the corresponding formal argument
of the Print() function.






2. The body of the function is executed using the values copied to the formal arguments in the
first step.

3. The return value of the function is substituted for the function call in the enclosing program.

B.1.21 Scope of Variables

A function's formal arguments are said to have local scope because they flow into existence when
the function begins to execute and cease to exist when the function is finished. It is also possible
to declare other variables within the body of a function. For example, this function defines a local
variable named sum:

Integer add(Integer a, Integer b) { // return the sum of a plus b
    Integer sum = a + b;
    Return(sum);
}

Like the formal arguments a and b, the scope of sum is local to the function add(). They disappear
when the function exits. The only thing that persists is the expression in the Return statement,
which is passed back to the caller of the function.

Local variables can also be declared within compound statements. For example,

If (x > 10 And y < 10) {
    Integer sum = x + y;
    Print(sum);
}

These variables disappear when the compound statement is exited.

Variables declared outside the scope of any function are called global variables. They are
accessible from the point they are declared until the end of the program. They are said to have global
scope.

B.1.22 Pass by Value vs. Pass by Reference

Global variables can be accessed directly within functions. For example, this function returns the
difference of global variables x and y.

Integer x = 2; // x is a global variable

Integer y = 3; // y is a global variable

Integer subxy() {Return(x - y);}

Referencing global variables directly inside a function is not a recommended practice because
it ties the function to particular individual variables, limiting its usefulness.

The reason people are tempted to reference global variables directly inside functions is that
ordinarily all that returns from a function is the expression in its Return() statement. Sometimes, it's
nice to allow a function to have additional side effects. That way functions can affect more than
one thing at a time in the program. But there's a better way to accomplish side effects: we can use
arguments to pass in a reference to a variable from outside.

As described in the preceding section, ordinarily only the value is copied from an actual
argument to its corresponding formal argument. But declaring a formal argument to be of type
Reference causes Musimat to let the function directly manipulate a variable supplied as an
actual argument. The function doesn't get the value of the variable, it gets the variable itself. When
a function changes a Reference formal argument, it changes the variable supplied as the actual
argument.

We can use Reference arguments to allow functions to have multiple effects on the variables
in a program. For example, let's declare a function that takes two Reference arguments and adds
10 to each of their values.

add10(Reference a, Reference b) {
    a = a + 10;
    b = b + 10;
}

Now let's declare two global variables with initial values:

Integer x = 2;

Integer y = 3;

Now let's use them as actual arguments to the function and then print their values:

add10(x, y);

Print(x);

Print(y);

This prints 12 and 13 because the function changed the values of both global variables. This is a
very handy trick.

Here are the rules to remember:

■ An ordinary (non-Reference) formal argument provides its function with a copy of its actual
argument. Changing the value of an ordinary (non-Reference) formal argument inside the
function does not change anything outside the function, that is, such arguments have local scope. The
actual arguments are said to be passed by value to the formal arguments.

■ A Reference formal argument provides its function with direct access to the variable named
as its actual argument. The actual argument must be a variable. Modifying the value of a
Reference argument inside a function changes the referenced variable outside the function.
Thus, the scope of a Reference formal argument is the same as the scope of its actual argument.
The actual arguments are said to be passed by reference to formal arguments when they are
declared to be of type Reference.
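The two rules can be imitated in a language that has no Reference type. The sketch below is Python, not Musimat, and rests on a stated assumption: a one-element list is used as a hypothetical "reference cell", so that the function can change what the caller sees. Both function names are invented for this illustration.

```python
def add10_by_value(a, b):
    # a and b are local names; reassigning them leaves the caller's variables untouched.
    a = a + 10
    b = b + 10

def add10_by_reference(a, b):
    # a and b are one-element lists standing in for Reference arguments;
    # changing their contents is visible to the caller.
    a[0] = a[0] + 10
    b[0] = b[0] + 10

x, y = 2, 3
add10_by_value(x, y)
after_value = (x, y)                       # unchanged by the call

x_cell, y_cell = [2], [3]
add10_by_reference(x_cell, y_cell)
after_reference = (x_cell[0], y_cell[0])   # both values changed by the call

print(after_value, after_reference)
```

The first call has no lasting effect, while the second changes both variables at once, which is the side effect that Reference arguments are meant to provide.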








B.1.23 Type Conversion

We can explicitly convert Integer expressions to Real, and vice versa. For example:

Real a = 10.0/3.0;

Print(a);

prints 3.333..., and

Integer b = Integer(a); // convert to Integer
Print(b);

prints 3.

When assigning a to b, the Real value a is converted to an integer by truncating (discarding)
the fractional part of a (that is, by discarding 0.333...), and the integer residue (3) is assigned to
b. If we then write

Real c = Real(b);

the integer value of b (which is 3) is converted to the equivalent Real value (3.0), which is stored
in Real variable c.

Converting from Real to Integer, we have some choices. For example, if

Real a = 10.0/3.0; // Real variable a is set to 3.333...

then

Real d = Floor(a); // d is set to 3.0

sets d to 3.0. The built-in Floor() function returns the largest integer not greater than its Real
argument. The statement

Real x = Ceiling(a);

sets x to 4.0 because the built-in Ceiling() function returns the smallest integer not less than its
argument.

We can round a Real to the nearest whole number as follows:

Real r = Floor(a + 0.5); // round a to the nearest whole number

If a = 2.4, then Floor(a + 0.5) returns 2.0. But if a = 2.5, Floor(a + 0.5) returns
3.0. Floor(a + 0.5) returns 2.0 for any value a in the range 2.0 to 2.4999... and returns
3.0 for any value a in the range 2.5 to 2.999.... But we don't have to do rounding ourselves;
Musimat has a built-in function:

Print(Round(2.49999)); // prints 2.0
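The conversion choices above can be compared side by side. This is a Python sketch, not Musimat: it assumes int(), math.floor(), and math.ceil() as stand-ins for Integer(), Floor(), and Ceiling(), and the helper name round_half_up is hypothetical. Python's own round() is deliberately avoided, because it rounds exact halves to the nearest even integer rather than upward.

```python
import math

a = 10.0 / 3.0

truncated = int(a)         # discards the fractional part
floor_a = math.floor(a)    # largest integer not greater than a
ceil_a = math.ceil(a)      # smallest integer not less than a

def round_half_up(value):
    """Round to the nearest whole number using the Floor(a + 0.5) technique."""
    return math.floor(value + 0.5)

print(truncated, floor_a, ceil_a)
print(round_half_up(2.4), round_half_up(2.5), round_half_up(2.49999))
```

For positive values, truncation and Floor agree; they differ for negative values, where truncation moves toward zero and Floor moves downward.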






B.1.24 Recursion 

Recursion means referring back to a value we've calculated previously. Consider the factorial
operation where 5! means 5 · 4 · 3 · 2 · 1. We could use a For statement to calculate factorials. This
function calculates factorials using iteration:

Integer factorial(Integer x) {
    Integer n = 1;
    For (Integer i = x; i > 1; i = i - 1)
        n = n * i;
    Return(n);
}

The statement

Print(factorial(5));

prints 120.

We start with n = 1 and i = 5. The For loop takes the previous value of n, multiplies it by the
current value of i and reassigns the value to n. It then decrements i and performs the operation
repeatedly so long as i > 1.

B.1.25 Recursive Factorial

Here is a more direct approach to computing factorials using recursion. Since x! = x · (x - 1)!, we
can write

Integer factorial(Integer x) {
    If (x == 1)
        Return(1);
    Else
        Return(x * factorial(x - 1));
}

This method has two states. If x == 1, we return 1 since 1! is equal to 1. Otherwise, we return
x multiplied by the factorial of x - 1. Consider the statement

Print(factorial(5));

When the factorial function is called, x is assigned the value 5. Because 5 is not equal to 1, the
factorial function evaluates the Else statement and calls factorial(4). Because 4 is not equal
to 1, the factorial function evaluates the Else statement and calls factorial(3), and so on.
Eventually, we reach factorial(1), which returns 1, which is multiplied by 2, then by 3, then
by 4, and finally by 5. The top-level factorial() function returns the product, 120, to the
Print() routine.
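The iterative and recursive definitions can be run against each other to confirm that they agree. The following is a Python sketch of both, not Musimat code; the two function names are invented here, and the recursive version assumes x is at least 1, as the text does.

```python
def factorial_iterative(x):
    """Multiply n by i for i = x, x - 1, ..., 2."""
    n = 1
    i = x
    while i > 1:
        n = n * i
        i = i - 1
    return n

def factorial_recursive(x):
    """Use x! = x * (x - 1)!, stopping at 1! = 1. Assumes x >= 1."""
    if x == 1:
        return 1
    return x * factorial_recursive(x - 1)

print(factorial_iterative(5), factorial_recursive(5))
```

Called with a value below 1, the recursive version would never reach its stopping case, so a practical version would need to check its argument first.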






B.1.26 Fibonacci Numbers 

In the sequence 

1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987, 1597, 2584, ...

each subsequent term is the sum of its two immediately preceding values. For example, 8 = 5 + 3.
This series, invented by Leonardo Pisano Fibonacci (1170-1250), is the solution to a problem he
posed in his book Liber Abaci: "A certain man put a pair of rabbits in a place surrounded on all sides
by a wall. How many pairs of rabbits can be produced from that pair in a year if it is supposed that
every month each pair begets a new pair which from the second month on becomes productive?"
Here is an iterative method of computing the Fibonacci sequence:

Integer iterFib(Integer n) {
    Integer fn1 = 1;
    Integer fn2 = 1;
    Integer result = 1;
    For (Integer i = 2; i < n; i = i + 1) {
        result = fn1 + fn2;
        fn2 = fn1;
        fn1 = result;
    }
    Return(result);
}

Executing this For loop,

For (Integer i = 1; i < 10; i++)
    Print(iterFib(i));

prints the sequence 1, 1, 2, 3, 5, 8, 13, 21, 34. Here is a method that accomplishes the same
calculation using recursion:

Integer recurFib(Integer n) {
    If (n == 1 Or n == 2)
        Return(1);
    Else
        Return(recurFib(n - 1) + recurFib(n - 2));
}

The recursive technique has crisper expressive power than the iterative approach because we see
the inner structure of the sequence directly in the method of its construction. However, it is
computationally much more expensive, especially for large n, because we must call the recurFib()
method twice at each step, whereas iterFib() performs only one addition and minor data
shuffling at each step. Here is an example where Knuth's "goodness" criterion depends upon
context. If efficiency is paramount, the iterative approach is preferred; recursion is preferable if
expressive crispness is most important.
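The difference in cost can be measured directly by counting function calls. The sketch below is Python, not Musimat; iter_fib mirrors the iterative method, while recur_fib_counted is a hypothetical variant of the recursive method that returns the number of calls it made alongside the Fibonacci value.

```python
def iter_fib(n):
    """Iterative Fibonacci: one addition per step from the third term onward."""
    fn1, fn2, result = 1, 1, 1
    for _ in range(2, n):
        result = fn1 + fn2
        fn2 = fn1
        fn1 = result
    return result

def recur_fib_counted(n):
    """Recursive Fibonacci returning (value, number of calls made)."""
    if n == 1 or n == 2:
        return 1, 1
    a, calls_a = recur_fib_counted(n - 1)
    b, calls_b = recur_fib_counted(n - 2)
    return a + b, calls_a + calls_b + 1

sequence = [iter_fib(i) for i in range(1, 10)]
value, calls = recur_fib_counted(20)
print(sequence)
print(value, calls)
```

For the twentieth term the iterative loop performs 18 additions, while the recursive version makes thousands of calls to reach the same value, and the gap widens rapidly as n grows.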

The Fibonacci sequence becomes relevant musically when we examine the ratios of subsequent
terms:

$$\frac{1}{1},\ \frac{2}{1},\ \frac{3}{2},\ \frac{5}{3},\ \frac{8}{5},\ \frac{13}{8},\ \ldots$$

The corresponding sequence of quotients is

1, 2.000, 1.500, 1.667, 1.600, 1.625, 1.615, 1.619, 1.618, ...

Thus we see that the ratio of adjacent Fibonacci numbers converges rapidly to the value of the
golden mean, Φ = 1.618.... The Greek letter phi, Φ, is commonly used to stand for the golden
mean. This number appears in a wide range of natural designs, including the arrangements of petals
in flowers, seed clusters, and pine cones. Studied at least since Euclid wrote his Elements, the
golden mean has appeared consciously and unconsciously as a central design element in countless
musical works (see section 9.16.1).
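The convergence of the quotients is easy to observe numerically. The following Python sketch (not Musimat) computes the ratios of adjacent terms and compares the last one with the closed form (1 + √5)/2 for the golden mean; that closed form is a standard result that is not stated in the text above, and the function name fib_ratios is invented for illustration.

```python
import math

def fib_ratios(count):
    """Return the first `count` ratios of adjacent Fibonacci numbers: 1/1, 2/1, 3/2, ..."""
    ratios = []
    previous, current = 1, 1
    for _ in range(count):
        ratios.append(current / previous)
        previous, current = current, previous + current
    return ratios

golden_mean = (1 + math.sqrt(5)) / 2   # closed form, approximately 1.618
ratios = fib_ratios(12)

print([round(r, 3) for r in ratios])
print(abs(ratios[-1] - golden_mean))   # distance of the twelfth ratio from the golden mean
```

The ratios alternate above and below the golden mean while closing in on it, which is why the rounded quotients settle at 1.618 after only a handful of terms.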

B.1.27 Other Built-in Functions 

Musimat includes standard mathematical functions such as Sqrt(x) = $\sqrt{x}$. There are
trigonometric functions such as Sin(x), Cos(x), and Tan(x). Arguments to trigonometric functions
are in real radian values. Speaking of radian measure, here's an interesting way to compute π to
the machine precision of your computer:

Constant Real Pi = Atan(1.0) * 4.0; // arctangent of 1 times 4 equals Pi

The function Abs(x) returns the absolute value of its argument. It works for either Real or
Integer expressions. For instance, both of the following statements will print True:

If (Abs(-5) == Abs(5)) Print(True); Else Print(False);         // Integer Abs()
If (Abs(-5.0) == Abs(5.0)) Print(True); Else Print(False);     // Real Abs()

With no arguments, the built-in function Random() returns a random value between 0.0 and 1.0,
but if Random() is given arguments specifying Real lower and upper bounds, it returns a Real
random value between those boundaries. For example,

Real x = Random(0.0, 11.0);

returns a random Real value in the range 0.0 ≤ x < 11.0. Note the range is from 0.0 to almost 11.0.

If Random() is given arguments specifying Integer lower and upper bounds, it returns an
Integer random value between those boundaries. For example,

Integer x = Random(0, 11);

returns a random Integer value in the range 0 ≤ x ≤ 11. Note the range is inclusive from 0 to 11.

B.1.28 Comments 

It is always helpful to readers if programmers insert comments into their programs. In Musimat,
any text beginning with two slashes // out to the end of the line is commentary. For example:

x = a + b; // this text is commentary

Sometimes it's useful to be able to put a comment anywhere, even in the middle of an expression.
All commentary between /* and */ is ignored.

x = y /* this commentary is ignored by Musimat */ + z;

When the expression is evaluated, all commentary is ignored, so the resulting expression is x =
y + z;. Commentary between /* and */ can extend over multiple lines of text, as necessary.

B.1.29 Representing Text

In order to print text, we use a data type called Character, which consists of the letters of the Roman
alphabet, digits from 0 to 9, and some nonprinting characters like tab, white space, and punctuation.
Characters are written in single quotes: 'a', 'b', 'c', and so on. Punctuation marks include ' '
(blank), ',' (comma), ';' (semicolon), and '.' (period). We can spell words and sentences by
making lists of characters, for example {'G', 'u', 'i', 'd', 'o'}, but this would be
excessively tedious. A shortcut for lists of characters is another data type called String. For example,

String c = "Ut queant laxis resonare";

This string is equivalent to, and much simpler than, assembling a list of characters.

Computers operate with binary numbers, not alphabetic letters. So we must associate each
character we want to display with a unique binary number. The computer operates only on the binary
numeric values; the display screen connected to the computer knows how to convert binary
numeric values to the corresponding characters for display.

We need a table listing the association between particular binary values and the corresponding
printed characters. This table is called a character set. When a key is pressed on a computer
keyboard, the keyboard looks up the corresponding binary number in the character set and sends it
to the computer. The computer forwards the number to the display screen, which also uses
the character set to determine which character to display. Only the keyboard and the screen use the
character set; the computer just stores the corresponding binary numbers.

International standard ISO-10646 defines a Universal Character Set, commonly called Unicode.
To keep things simple, Musimat uses a common subset of Unicode called ASCII (see section B.2).
The built-in Character() function takes an ASCII character code as its argument and returns the
corresponding printable character.


Print(Character(65)); 







prints the character 'A'. The Integer() function can take a printable character as its
argument and return the corresponding ASCII character code. For example:

Print(Integer('A'));

prints 65.
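The round trip between characters and their numeric codes can be tried in any language with similar built-ins. The sketch below is Python, not Musimat; it assumes chr() and ord() as stand-ins for Character() and Integer(), and the variable names are illustrative.

```python
letter = chr(65)        # character for the code 65
code = ord("A")         # code for the character 'A'

text = "Ut queant laxis resonare"
codes = [ord(ch) for ch in text]           # string -> list of numeric codes
rebuilt = "".join(chr(c) for c in codes)   # numeric codes -> string

print(letter, code)
print(codes[:3], rebuilt == text)
```

The list of codes is all the computer actually stores; the letters reappear only when the codes are looked up in the character set again.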

B.2 Music Datatypes in Musimat

This section describes the design of music data types available in Musimat for representing pitch,
rhythm, duration, frequency, and loudness.

B.2.1 Pitch

We would ideally like to have a uniform way to represent all pitch systems discussed in chapter 3.
It would be convenient if we could do arithmetic on pitches, for example, to find the size of an
interval by subtracting two pitches, to calculate the frequency of a pitch, or to get the pitch of
a frequency.

Solving the simplest problem first, I designed a data type for the equal-tempered scale using the
piano keyboard. This can be generalized to other scales. The gamut of a standard piano keyboard
is 88 keys, indexed from 0 to 87, lowest to highest. We start by associating each key number with
a name. The lowest pitch on standard pianos is A0, corresponding to key 0, and the highest pitch
is C8, corresponding to key 87. Interval size in degrees is the difference between key indexes. For
example, C4 is key 39 and F4 is key 44, so the interval C4-F4 corresponds to five semitones, which
is the diatonic interval of a fourth.

Musimat comes with a built-in data type called Pitch. By default, it assumes 12 degrees per
octave, but the degrees can correspond to any frequencies, so for example, it can be used directly
to create any dodecaphonic scale. It also can be adjusted to handle scales with other than 12 degrees
per octave.

By default, the Pitch data type emulates common musical notation conventions regarding scale
degrees, interval sizes, and transposition. For example, the pitch As4 (pitch class A♯ in the fourth
piano octave) is defined as

Pitch As4 = Pitch(9, 1, 4);

The first number, 9, represents the diatonic degree as the number of semitones above C.
Diatonic pitch A is the ninth semitone above C (see figure B.1). The second number, 1, indicates
the accidental. In this case, the A is sharped (raised by a semitone). The chromatic scale
degree is obtained by adding the diatonic scale degree, 9, and the accidental, 1, which for A♯
yields 10 (see figure B.1). The third number, 4, indicates the octave on the standard piano
keyboard.






[Figure B.1: Diatonic degrees expressed as chromatic pitch classes, numbered 0 to 11.]

ThechromaticdegreesfrorriAO toes arepredefined in Musimat in both fíats and sharps. Since 
As 4 and Bb4 representthe same chromatic degree, the statement 

Print(Bb4 == As4); 

prints True. In general, a pitch is defined by the triple (pitch-class, accidental, octave), where
pitch-class is an integer from 0 to N − 1, and N is the number of degrees in an octave.

In defining the pitch A♯4, the triple (9, 1, 4) is assigned to the variable As4. Variable As4
contains these three values as one compound entity. This compound value can be passed from one
Pitch variable to the next. For example, the statements

Pitch x = As4; // assign As4 to x
Print(x == As4);

print True. Arithmetic can be performed on pitches to sharp or flat them. For example,
Print(A4 + 1) prints As4, and Print(A4 - 3) prints Gb4. Similarly, Print(A4 * 3)
prints C12, and Print(A4 / 3) prints E1.

Each element of a Pitch can be accessed using these built-in functions: 

PitchClass(Pitch p)    Returns the diatonic pitch class. For example, if p is As4, 9
                       is returned (see figure B.1).

Accidental(Pitch p)    Returns the accidental as an integer, where 0 is natural,
                       negative values are increasingly flat, and positive values are
                       increasingly sharp. For example, if p is As4, 1 is returned.

Octave(Pitch p)        Returns the octave on the piano keyboard. For example, if
                       p is As4, 4 is returned.

These elements can be used to determine the piano key index corresponding to a particular pitch:

Integer key(Pitch p) {
    Integer pc = PitchClass(p);              // from 0 .. 11
    Integer acc = Accidental(p);             // -1=flat, 0=natural, +1=sharp
    Integer oct = Octave(p);                 // from 0 .. 8
    Return((pc + acc) + 12 * (oct - 1) + 3); // combine
}


A way to think about the expression in the Return() statement is as follows. Say we want to find
the piano key index for A0. We know it's the bottom note on the piano, so it should return an index
value of 0. The triple of A0 is (9, 0, 0). The expression in the Return() statement equals 0 for
this triple. Similarly, the triple of A4 is (9, 0, 4), and its corresponding key index is 48.
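The key index arithmetic can be checked outside Musimat with a minimal Python sketch. The function name key_index and the use of three plain integers in place of a Pitch value are assumptions made for illustration; they are not part of Musimat.

```python
def key_index(pitch_class: int, accidental: int, octave: int) -> int:
    """Piano key index (A0 = 0) for a (pitch-class, accidental, octave) triple.

    Mirrors the Return() expression in key(): chromatic degree, plus
    12 keys per octave counted from octave 1, plus an offset of 3.
    """
    return (pitch_class + accidental) + 12 * (octave - 1) + 3


if __name__ == "__main__":
    print(key_index(9, 0, 0))  # A0, the bottom key
    print(key_index(9, 0, 4))  # A4
    print(key_index(9, 1, 4))  # As4, one key above A4
    print(key_index(0, 0, 8))  # C8, the top key
```

With 88 keys numbered from 0, the top key C8 should land on index 87, which gives a quick sanity check of the offset.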

Equal-Tempered Frequency  Pitch provides a representation of scale degrees and does not
denote frequency. We can convert to frequency using any scale system we like, beginning with the
equal-tempered scale. We can compute the equal-tempered frequency of a pitch, assuming a
reference such as A4 equals 440 Hz, by adapting equation (3.3), $f_{k,v} = f_R \cdot 2^{v + k/12}$, to compute
hertz values from chromatic scale degrees:


Real pitchToHz(Pitch p) {
    Real R = 440.0;                                      // reference frequency
    Real key = PitchClass(p) + Accidental(p);            // get key index
    Real oct = Octave(p);                                // get octave
    Return(R * Pow(2.0, (oct - 4) + (key - 9) / 12.0));  // return frequency
}


A way to think about the expression in the Return() statement is as follows. The reference
pitch is 440 Hz, corresponding to A4. So we want the value returned from this function to equal
440.0 when p is A4. The triple for A4 is (9, 0, 4), so when pitchToHz() is called with A4,
we want to evaluate $f_R \cdot 2^0$, which can be achieved by subtracting 9 from the pitch and 4 from the
octave. Then, executing

Print(pitchToHz(A4)); 

prints 440.0, and substituting any other pitch, regardless of how it is spelled, will produce its
proper hertz value. For example, A0 is 27.5 Hz, C4 is 261.63 Hz, and C8 is 4186.01 Hz.
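The same calculation can be sketched in Python to confirm the figures quoted above. The function name pitch_to_hz and its integer-triple arguments are assumptions made for this illustration; only the formula comes from the Musimat listing.

```python
def pitch_to_hz(pitch_class: int, accidental: int, octave: int,
                ref_hz: float = 440.0) -> float:
    """Equal-tempered frequency of a (pitch-class, accidental, octave) triple.

    A4 is the triple (9, 0, 4), so subtracting 9 from the key and 4 from
    the octave makes the exponent zero at the reference pitch.
    """
    key = pitch_class + accidental
    return ref_hz * 2.0 ** ((octave - 4) + (key - 9) / 12.0)


if __name__ == "__main__":
    for label, triple in [("A0", (9, 0, 0)), ("C4", (0, 0, 4)),
                          ("A4", (9, 0, 4)), ("C8", (0, 0, 8))]:
        print(label, round(pitch_to_hz(*triple), 2))
    # Enharmonic spellings give the same frequency: As4 versus Bb4.
    print(pitch_to_hz(9, 1, 4) == pitch_to_hz(11, -1, 4))
```

Because only the sum of pitch class and accidental enters the formula, the spelling of a pitch does not affect the result.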

What if we have a frequency x in hertz and want to find its corresponding pitch? The problem
is that x may lie in between the pitches of the scale because x can be any frequency. One approach
is to compare x to each semitone on the keyboard from lowest frequency to highest, and to stop
when the keyboard frequency exceeds x. Then the key one semitone below, the highest key whose
frequency does not exceed x, is taken as the corresponding pitch on the keyboard.


Pitch hzToPitch(Real x) {                              // find pitch closest to x Hz
    For (Integer k = 9 + 1; k < 88 + 9; k = k + 1) {   // test from As0 to C8
        Pitch p = (k);                                 // get pitch of k
        Real f = pitchToHz(p);                         // get frequency of p
        If (f > x)                                     // have we passed our target?
            Return(p - 1);                             // return previous pitch
    }
    // if we get here, the Hz value of x is beyond the end of the keyboard
    Return(C8);                                        // out of range, clip at C8
}

This code returns A0 if x is lower than or equal to A0, and it returns C8 if x is greater than or equal
to C8.
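The linear search can be tried out with a short Python sketch. Here a pitch is represented by a single integer k = 12 * octave + chromatic degree (so A0 is 9 and C8 is 96); that representation, the helper names, and the sharp-only name table are assumptions made for illustration, not Musimat's internals.

```python
NAMES = ["C", "Cs", "D", "Ds", "E", "F", "Fs", "G", "Gs", "A", "As", "B"]


def k_to_hz(k: int, ref_hz: float = 440.0) -> float:
    """Equal-tempered frequency of chromatic index k, where A4 is k = 57."""
    return ref_hz * 2.0 ** ((k - 57) / 12.0)


def hz_to_pitch(x: float) -> int:
    """Highest keyboard index whose frequency does not exceed x, clipped to A0..C8."""
    for k in range(9 + 1, 88 + 9):      # test from As0 (10) to C8 (96)
        if k_to_hz(k) > x:              # have we passed our target?
            return k - 1                # return previous pitch
    return 96                           # beyond the keyboard, clip at C8


def name(k: int) -> str:
    """Spell index k with sharps, for example 57 -> 'A4'."""
    return f"{NAMES[k % 12]}{k // 12}"


if __name__ == "__main__":
    for hz in (10.0, 440.0, 460.0, 5000.0):
        print(hz, name(hz_to_pitch(hz)))
```

Note that 460 Hz maps to A4 even though As4 (about 466.16 Hz) is nearer, because the search stops at the first key above x and steps back one semitone.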

Lists of Pitches  We can collect pitches into lists:

PitchList shave(C5, G4, G4, Ab4, G4, B4, C5); // shave and a haircut, 2 bits

We can do arithmetic on all the pitches in a list. To transpose this pitch list up a whole step,

Print( shave = shave + 2 );

adds two degrees to every pitch in shave, and prints {D5, A4, A4, As4, A4, Cs5, D5}.
To transpose by an octave,

Print( shave = shave * 2 );

multiplies every pitch in the list by 2 and prints {D6, A5, A5, As5, A5, Cs6, D6}.

Pythagorean Chromatic Scale  We can compute the frequency of a pitch in Pythagorean
chromatic tuning, assuming a reference such as A4 equals 440 Hz. We start by computing the
frequency of Pythagorean middle C from the reference frequency, using equation (3.11). We define
the reference frequencies in Musimat as follows:

Real R = 440.0;
Real cPi4 = R * 16.0/27.0; // Pythagorean middle C, 260.74 Hz

Next, referring to figure 3.7, we tabulate the ratios of the Pythagorean chromatic scale in
Musimat using a RealList:

RealList pythagoreanChromatic(
    1.0/1.0, 256.0/243.0, 9.0/8.0, 32.0/27.0,
    81.0/64.0, 4.0/3.0, 1024.0/729.0, 3.0/2.0,
    128.0/81.0, 27.0/16.0, 16.0/9.0, 243.0/128.0
);

Last, we define a variation of the pitchToHz() function. This version has the same name but
takes three arguments instead of one.⁴ When supplied with a certain pitch p, it returns the
frequency corresponding to its Pythagorean intonation as a Real value in hertz.

Real pitchToHz(
    Pitch p,            // pitch
    Real refC,          // reference frequency
    RealList scale      // ratios of scale degrees
) {
    Integer key = PitchClass(p) + Accidental(p);        // get key from pitch
    Real oct = Octave(p);                               // get octave from pitch
    Return(refC * scale[key] * Pow(2.0, (oct - 4)));    // compute frequency
}


The Return() statement calculates the frequency of the key from the reference frequency
times the ratio for that degree, then adjusts it for the proper octave. Calling

Print("A4=", pitchToHz(A4, cPi4, pythagoreanChromatic));

prints A4=440.0, and

Print("C4=", pitchToHz(C4, cPi4, pythagoreanChromatic));
prints C4=260.74, as expected.
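A table-driven version of the conversion can be sketched in Python as follows. Exact fractions are used so that A4 comes out at exactly 440; the names PYTHAGOREAN, C_PYTH_4, and pitch_to_hz, and the integer-triple arguments, are assumptions made for this illustration.

```python
from fractions import Fraction as F

# Ratios of the Pythagorean chromatic scale, as tabulated in the text.
PYTHAGOREAN = [
    F(1, 1), F(256, 243), F(9, 8), F(32, 27),
    F(81, 64), F(4, 3), F(1024, 729), F(3, 2),
    F(128, 81), F(27, 16), F(16, 9), F(243, 128),
]

C_PYTH_4 = F(440) * F(16, 27)   # Pythagorean middle C, about 260.74 Hz


def pitch_to_hz(pitch_class, accidental, octave, ref_c, scale):
    """Frequency from a scale table: reference C times degree ratio times octave shift."""
    key = pitch_class + accidental
    return float(ref_c * scale[key] * F(2) ** (octave - 4))


if __name__ == "__main__":
    print("A4 =", pitch_to_hz(9, 0, 4, C_PYTH_4, PYTHAGOREAN))
    print("C4 =", round(pitch_to_hz(0, 0, 4, C_PYTH_4, PYTHAGOREAN), 2))
```

Swapping in a different ratio table and reference C is all that is needed to retune the same function.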

Natural Chromatic Scale  To create the natural chromatic scale, all we need now is to establish
the frequency reference for natural chromatic middle C and tabulate the ratios of the scale.

Real R = 440.0;
Real cNat4 = R * 3.0/5.0; // 264.00 Hz

RealList naturalChromatic(
    1.0/1.0, 16.0/15.0, 9.0/8.0, 6.0/5.0,
    5.0/4.0, 4.0/3.0, 64.0/45.0, 3.0/2.0,
    8.0/5.0, 5.0/3.0, 16.0/9.0, 15.0/8.0
);

Then

Print("A4=", pitchToHz(A4, cNat4, naturalChromatic));
prints A4=440.0, and

Print("C4=", pitchToHz(C4, cNat4, naturalChromatic));
prints C4=264.00.

Sruti Scale  As a final example, we adapt Pitch to handle nondodecaphonic scales by
demonstrating the sruti scale (see figure 3.25). There are 22 degrees in this scale. We start by defining the
ratios of the sruti scale:

RealList srutiScale(
    1.0/1.0, 256.0/243.0, 16.0/15.0, 10.0/9.0, 9.0/8.0, 32.0/27.0, 6.0/5.0,
    5.0/4.0, 81.0/64.0, 4.0/3.0, 27.0/20.0, 45.0/32.0, 729.0/512.0,
    3.0/2.0, 128.0/81.0, 8.0/5.0, 5.0/3.0, 27.0/16.0, 16.0/9.0, 9.0/5.0,
    15.0/8.0, 243.0/128.0
);


We want to preserve the reference A440 Hz and use it to find the frequency of the lowest degree
of the scale, as we've done for Pythagorean and natural scales. But which of the 22 degrees should
correspond to A440? The sruti scale contains both the Pythagorean major sixth (27/16) and the
natural major sixth (5/3). Let's choose the simpler 5/3 ratio at degree 17 to correspond to A440.
Then the lowest degree of the sruti scale has the same frequency as the natural chromatic middle
C, 264.0 Hz.

Real R = 440.0; 

Real srutiRef = R * 3.0/5.0; // 264.00 Hz 

Next, we must inform Pitch of how many degrees there are per octave, which we can do by
finding the length of the list of ratios:

SetDegrees(Length(srutiScale)); // set number of degrees in scale 

The built-in SetDegrees() function adjusts the internal calculations of Pitch to the
specified number of degrees in the scale. To keep things simple, the degrees of the sruti scale are
indicated only by their degree numbers, rather than by trying to extend the Western pitch-naming
system. Then the frequencies of particular sruti degrees are computed as follows:

For (Integer i = 0; i < Length(srutiScale); i = i + 1) {
    Pitch x(i, 0, 4);                               // pitch, accidental, octave
    Real f = pitchToHz(x, srutiRef, srutiScale);
    Print(f);
}


which prints the frequencies of the sruti scale from middle C as follows:

Degree          1       2       3       4       5       6       7       8       9      10      11
Frequency    264.00  278.12  281.60  293.33  297.00  312.89  316.80  330.00  334.13  352.00  356.40

Degree         12      13      14      15      16      17      18      19      20      21      22
Frequency    371.25  375.89  396.00  417.19  422.40  440.00  445.50  469.33  475.20  495.00  501.19
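The table can be reproduced with a few lines of Python. This is only a sketch: the names SRUTI and SRUTI_REF are assumptions, exact fractions stand in for Musimat's Real arithmetic, and the list index runs from 0 while the table above counts degrees from 1.

```python
from fractions import Fraction as F

# The 22 sruti ratios, in the order listed in the text.
SRUTI = [
    F(1, 1), F(256, 243), F(16, 15), F(10, 9), F(9, 8), F(32, 27), F(6, 5),
    F(5, 4), F(81, 64), F(4, 3), F(27, 20), F(45, 32), F(729, 512),
    F(3, 2), F(128, 81), F(8, 5), F(5, 3), F(27, 16), F(16, 9), F(9, 5),
    F(15, 8), F(243, 128),
]

SRUTI_REF = F(440) * F(3, 5)    # lowest degree: 264 Hz, so that 5/3 gives A440

if __name__ == "__main__":
    for degree, ratio in enumerate(SRUTI, start=1):
        print(degree, f"{float(SRUTI_REF * ratio):.2f}")
```

Values that fall exactly on a rounding tie, such as 334.125 Hz at degree 9, may print with a different last digit than the table, depending on the rounding rule in use.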

Other scales, such as Partch's scale and the quarter-tone scale, can be constructed in the same
manner. The Bohlen-Pierce scale can also be constructed this way because the SetDegrees()
function only specifies the number of degrees in the scale and makes no assumptions about octave
equivalence.





B.2.2 Rhythm 

Duration in common music notation is expressed as a fraction of a whole note. For example, a whole
note equals four quarter notes:

𝅝 = ♩ + ♩ + ♩ + ♩

We could write this mathematically as follows:

$$1 = \frac{1}{4} + \frac{1}{4} + \frac{1}{4} + \frac{1}{4}$$

which suggests using rational fractions to represent rhythmic durations. A rational fraction is a
ratio of integers. Musimat comes with a built-in data type called Rhythm, which emulates
common musical notation conventions regarding rhythm. For example, the quarter note is defined as

Rhythm Q = Rhythm( 1, 4 ); 

The first number is the numerator of the rational fraction, the second is the denominator. Note that
we can't write Rhythm(1/4) because the integer quotient of 1/4 is 0 with a remainder of 1; integer
division is performed if both the numerator and denominator are integers, which won't work here.
Specifying the numerator and denominator separately avoids this problem and has some other
numerical advantages as well. Executing Print(Q); prints (1, 4). Internally, Rhythm()
keeps the integer numerator and denominator values separately.

Rhythmic duration can also be given as a real expression. Print(Rhythm(0.5)); prints
(1, 2). How does Rhythm() convert this real expression into a ratio of integers? It does so by
calling the following function internally:

realToRational(Real f, Integer Reference num, Integer Reference den) {
    Constant Integer iterations = 3000000;
    Constant Real limit = 0.000000000001;
    num = den = 1;                                  // start off with ratio of 1/1
    For (Integer i = 0; i < iterations; i = i + 1) {
        If (RealAbs(Real(num) / Real(den) - f) < limit)
            Return;                                 // we have reached the limit
        Else {
            Real x = RealAbs(Real(num+1) / Real(den) - f);
            Real y = RealAbs(Real(num) / Real(den+1) - f);
            If (x < y)
                num = num + 1;
            Else
                den = den + 1;
        }
    }
    Return;  // if we get here, we've not converged on the limit
}   // RealAbs() is just a version of Abs() that uses Real arithmetic

Function realToRational() takes a Real value f and attempts to find a rational fraction
num/den that is as close as possible to it. It starts by setting num = den = 1 and asking whether
num/den is already close enough to f. If so, it returns. Otherwise, it asks whether (num+1)/den
is closer than num/(den+1). If so, it increments num; otherwise it increments den and repeats the
process. Because num and den are Reference arguments, any changes to these variables within
realToRational() are reflected in the value of the actual arguments supplied to it.
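The search is easy to experiment with in Python. The following is a sketch of the same increment-numerator-or-denominator strategy, not Musimat's own code: the name real_to_rational is an assumption, and the result is returned as a tuple because Python has no Reference arguments.

```python
def real_to_rational(f: float, iterations: int = 3_000_000,
                     limit: float = 1e-12) -> tuple:
    """Approximate f by num/den, growing num or den by 1 at each step."""
    num, den = 1, 1                         # start off with ratio of 1/1
    for _ in range(iterations):
        if abs(num / den - f) < limit:
            break                           # we have reached the limit
        x = abs((num + 1) / den - f)        # error if num is incremented
        y = abs(num / (den + 1) - f)        # error if den is incremented
        if x < y:
            num += 1
        else:
            den += 1
    return num, den


if __name__ == "__main__":
    for value in (0.5, 0.25, 0.75, 3.0 / 16.0, 1.0 / 12.0):
        print(value, real_to_rational(value))
```

Each step adds one to either num or den, so the number of trials needed to reach a fraction num/den is num + den - 2. This is why simple rhythmic values converge almost immediately while a value such as 3.14159265 takes millions of steps.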

This method can be used to find rational approximations to most any real value. For example,

Real Pi = 3.14159265;
Print(Rhythm(Pi));

prints (1953857, 621932). Note that 1953857/621932 = 3.14159265, which is
pretty close to the value of π. This method is limited by the precision of the computer hardware.
The precision of a rational approximation depends upon the value of the built-in variables
iterations and limit. For example, with the values shown in the preceding code, it took
realToRational() 2,575,787 trials to come up with its best approximation of π, requiring
about 1 second on my computer. The iterations and limit parameters can be set to whatever
values produce the optimal performance/accuracy cost/benefit ratio. Barring obscure rhythms
(nothing, say, beyond triplet eighths), iterations = 240 and limit = 1.0/480 should be
satisfactory.

Although the details go beyond the scope of this book,⁵ here is a sketch of how Rhythm() uses
realToRational():

Rhythm(Real x) {
    Integer num, den;               // internal parameters for Rhythm
    realToRational(x, num, den);    // convert x to num/den rational fraction
    // . . .
}

When called with a Real argument, Rhythm() calls realToRational() to set its internal integer
rational fraction values.

Musimat provides built-in definitions for standard binary divisions of a whole note:

Constant Rhythm W = Rhythm(1.0/1.0);
Constant Rhythm H = Rhythm(1.0/2.0);
Constant Rhythm Q = Rhythm(1.0/4.0);
Constant Rhythm E = Rhythm(1.0/8.0);
Constant Rhythm S = Rhythm(1.0/16.0);

It is easy to define ternary divisions as well. For example, a triplet eighth is Rhythm(1.0/12.0)
because there are 3 · 4 = 12 triplet eighths per whole note. By the same reasoning, a quintuplet
eighth is Rhythm(1.0/20.0).

We can express compound rhythms by addition. For example, a dotted half note is

Rhythm Hd = Rhythm(1.0/2.0 + 1.0/4.0); // dotted half

Equivalently,

Rhythm Hd = Rhythm(3.0 / 4.0); // also a dotted half

We can also do arithmetic directly with rhythms. For example, Print(E + S) prints (3, 16).
Also, Print(W - S) prints (15, 16), Print(Q * S) prints (1, 64), and Print(Q / S)
prints (4, 1). The last value corresponds to a duration of four whole notes.
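Rational rhythm arithmetic of this kind can be tried in Python with the standard fractions module. This is an analogy rather than Musimat's Rhythm type; the single-letter names simply follow the built-in definitions above.

```python
from fractions import Fraction

# Whole, half, quarter, eighth, and sixteenth notes as fractions of a whole note.
W, H, Q, E, S = (Fraction(1, d) for d in (1, 2, 4, 8, 16))

if __name__ == "__main__":
    print(E + S)            # 3/16
    print(W - S)            # 15/16
    print(Q * S)            # 1/64
    print(Q / S)            # 4
    print(H + Q)            # 3/4, a dotted half note
    print(float(E + S))     # 0.1875
```

Fraction keeps results in lowest terms, so Q / S is shown as 4 rather than as the pair (4, 1).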

We can extract the numerator and denominator from Rhythm():

Integer num, den;
Rhythm(E+S, num, den); // assigns rational fraction for E+S to num and den

Used this way, Rhythm() calculates the rational fraction of its first argument and sets num and den
by reference to the result. For the preceding example, num is set to 3 and den is set to 16. We
can leverage this capability to obtain the duration of a rhythm as a real number:

Real realRhythm(Rhythm x) {
    Integer num, den;
    Rhythm(x, num, den);              // find rational fraction for x and set num and den
    Return(Real(num) / Real(den));    // convert num and den to reals and divide
}

Then, for example, executing Print(realRhythm(E + S)); prints 0.1875.

As with pitches, we can make lists of rhythms.

RhythmList R = {Q, E, E, E, S, S, Q};
Print(R);

prints {(1,4), (1,8), (1,8), (1,8), (1,16), (1,16), (1,4)}.

B.2.3 Tempo 

In common music notation, tempo is expressed using Mälzel's metronome markings (see
section 2.6.2). For example, ♩ = 60MM indicates that the beat or pulse of the music is associated
with quarter notes and that there are 60 beats per minute. Thus at ♩ = 60MM each quarter note
lasts 1 second, and at ♩ = 120MM each quarter note lasts 0.5 second. Thus tempo scales the
durations of rhythms.







We can emulate this by calculating a tempo factor based on Mälzel's metronome markings.
Rhythms are then multiplied by this coefficient to determine their actual duration. First, we need
a function that calculates the tempo factor:

Real mm(Real beats, Real perMinute) {
    Return(1.0 / (4.0 * beats) * 60.0 / perMinute);
}

The beats argument is the rhythmic value that gets the beat, and the perMinute argument is the
number of beats per minute. For example,

Real tempoScale = mm(Q, 60.0); // 60 quarter notes per minute
sets tempoScale to 1.0, and

Real tempoScale = mm(Q, 120.0); // 120 quarter notes per minute

sets tempoScale to 0.5. Scaling a list of rhythms with tempoScale adjusts them to the
prevailing tempo. Start with a rhythm list.

RhythmList T = {Q, E, E, E, S, S, Q};
Print(T);

prints {(1,4), (1,8), (1,8), (1,8), (1,16), (1,16), (1,4)}. Now scale it.

RhythmList S = T * tempoScale; // tempoScale
Print(S);

prints {(1,8), (1,16), (1,16), (1,16), (1,32), (1,32), (1,8)}.
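The tempo factor can be checked with exact fractions in Python. This is a sketch under the assumption that rhythms are modeled as Fraction values; the function name mm follows the listing above, and everything else is illustrative.

```python
from fractions import Fraction

Q, E, S = Fraction(1, 4), Fraction(1, 8), Fraction(1, 16)


def mm(beats: Fraction, per_minute: int) -> Fraction:
    """Tempo factor for a metronome mark: beat value and beats per minute."""
    return Fraction(1) / (4 * beats) * Fraction(60) / per_minute


if __name__ == "__main__":
    T = [Q, E, E, E, S, S, Q]
    print(mm(Q, 60), mm(Q, 120), mm(Q, 90))
    print([str(r * mm(Q, 120)) for r in T])   # scaled to 120 quarter notes per minute
    print([str(r * mm(Q, 90)) for r in T])    # scaled to 90 quarter notes per minute
```

At 90 quarter notes per minute the factor is 2/3, so a quarter note becomes 1/6, an eighth 1/12, and a sixteenth 1/24.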

Though this explicit approach to managing tempo works fine, in fact Rhythm() has this
calculation conveniently built in. It works in conjunction with a built-in function named SetTempo()
that implicitly scales all rhythmic durations by the specified tempo factor. So, for example, given
the preceding definition of RhythmList T,

SetTempo(mm(Q, 90)); // set tempo to 90 quarter notes per minute
Print(T);

prints {(1,6), (1,12), (1,12), (1,12), (1,24), (1,24), (1,6)}. All rhythmic values
are scaled implicitly by Rhythm().

B.2.4 Loudness 

Loudness is expressed in common music notation using performance indications such as
fortissimo or piano (see section 2.7). But the performed intensity depends upon the acoustical power of
the instrument and the interpretation of the performer. A better approach for the purpose here
would be to define loudness in objective terms using decibels (see section 5.5.1).





Since microphones and loudspeakers measure and reproduce pressure waves, it is common to
use dB SPL in audio work (see equation 5.32). It is also conventional in audio to take the loudest
value that can be reproduced without distortion as a reference intensity of 0 dB (see section 4.24.2).
Since measured intensities will be less intense than the reference, then by the definition of the
decibel they will be expressed as negative decibel levels. We can write, for example, -6 dB to indicate
an amplitude that is (very close to) one half of the amplitude of the 0 dB reference.

Restating (5.32), the equation for dB SPL, as

$$y_{\mathrm{dB}} = 20 \log_{10} \frac{A'}{A}$$

and simplifying by letting $x = A'/A$, we have $y_{\mathrm{dB}} = 20 \log_{10}(x)$. Solving for x, we have

$$x = 10^{y/20}. \tag{B.1}$$

For example, setting y = -6 dB, we have $x = 10^{-6/20} = 0.501$. The value of x is the coefficient by
which a signal must be multiplied to lower its amplitude by 6 dB. For another example, setting
y = 0 dB, we have $x = 10^{0/20} = 1$. So multiplying by 0 dB does not affect amplitude. Setting
y = -120 dB, we have $x = 10^{-120/20} = 0.000001$, so multiplying a signal by -120 dB renders it
virtually inaudible. Finally, if we wish to amplify a soft sound, scaling it by +6 dB (very nearly)
doubles its amplitude. Thus, scaling sounds with decibel coefficients allows us to achieve arbitrary
loudness levels for waveforms. So we define

Real dB(Real y) {
    Return(Pow(10.0, y/20.0));
}

For example, Print(dB(-6)) prints 0.501187, Print(dB(0)) prints 1.0, and
Print(dB(-120)) prints 0.000001.
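Equation (B.1) is a one-liner in most languages. A minimal Python sketch follows; the name db and the formatting choices are ours, not Musimat's.

```python
def db(y: float) -> float:
    """Amplitude coefficient for a level of y decibels: x = 10 ** (y / 20)."""
    return 10.0 ** (y / 20.0)


if __name__ == "__main__":
    print(f"{db(-6):.6f}")     # 0.501187
    print(db(0))               # 1.0
    print(f"{db(-120):.6f}")   # 0.000001
    print(f"{db(6):.6f}")      # 1.995262
```

The last line shows that +6 dB multiplies amplitude by just under 2, the reciprocal of the -6 dB coefficient.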

Suppose we have the following audio samples for a sound: 

RealList mySound = {0, 0.16, 0.192, -0.37, -0.45, -0.245, -0.43, 0.09, . . .};

We wish to halve the sound's amplitude. Then 

RealList scaledSound = mySound * dB(-6); 

Print(scaledSound); 

prints {0.02, 0.08, 0.10, -0.19, -0.23, -0.12, -0.22, 0.05, . . .}.

See volume 2, chapter 1, for more about sampled signals.

Musimat provides built-in definitions for standard music dynamics levels based on figure 4.7.

Real ffff = dB(0), fff = dB(-10), ff = dB(-18), f = dB(-24), 
mf = dB(-32), mp = dB(-40), p = dB(-48), pp = dB(-56), 
ppp = dB (-64); 



Table B.1

ASCII Character Codes


Thus ffff does not change the amplitude of the signal, but all others attenuate it to varying 
degrees. 

B.3 Unicode (ASCII) Character Codes 

The Universal Character Set, or Unicode, encodes virtually all of the world's characters and even
leaves room for characters not yet invented. A common subset of Unicode is ASCII (American
Standard Code for Information Interchange), which was proposed by ANSI in 1963 and adopted in 1968.
Recent standards that refer to ASCII include ISO-14962-1997 and ANSI-X3.4-1986 (R1997). The
ASCII code includes many punctuation marks and white space such as blank, tab, and newline (which
forces subsequent text onto a new line).

To obtain the integer ASCII number corresponding to a character, first find the row r and column
c containing the character in table B.1. The ASCII number of this character is 16r + c. For example,
the character 'A' corresponds to 16 · 4 + 1 = 65.
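The lookup can also be done in code. The sketch below assumes the table is laid out with 16 codes per row, rows and columns both counted from 0; that layout, like the function name ascii_row_col, is an assumption made for illustration. The character codes themselves come from Python's built-in ord().

```python
def ascii_row_col(ch: str, columns: int = 16) -> tuple:
    """Row and column of a character in a table with `columns` codes per row."""
    return divmod(ord(ch), columns)


if __name__ == "__main__":
    r, c = ascii_row_col("A")
    print(r, c, 16 * r + c)         # 4 1 65
    print(ascii_row_col(" "))       # SP, code 32
    print(ascii_row_col("\x7f"))    # DEL, code 127
    print(ascii_row_col("\n"))      # LF, code 10
```

Going the other way, chr(16 * r + c) recovers the character from its row and column.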

The characters between 0 and 31 and DEL are reserved for functions that mostly don't concern
computer users, except for CR (carriage return) and LF (line feed). SP stands for the space
character " ". This is another one of those tables that you must learn if you expect your geek friends
to take you seriously, so place a copy of table B.1 at your bedside or above the mantelpiece, where
you can refer to it frequently.

B.4 Operator Associativity and Precedence in Musimat

To keep it simple, the Musimat expressions in this book are formatted to obey simple left-to-right
evaluation. In fact, the rules are a little more complex because Musimat is basically C++ in
sheep's clothing.




Table B.2

Operator Precedence and Associativity

Operator     Associativity    Description                            Examples
()           left to right    grouping                               a * (x+y) == ax + ay
-            right to left    negation                               -3 == -1 * 3
* /          left to right    multiplication and division            a * b, a / b
%            left to right    remainder after integer division       10 % 3 == 1, 12 % 3 == 0
+ -          left to right    addition and subtraction               a + b, a - b
< <= > >=    left to right    less-than, less-than-or-equal,         a > b, a <= b
                              greater-than, greater-than-or-equal
== !=        left to right    equal, not equal                       a == b, a != b
And          left to right    logical AND                            False And False == False
                                                                     False And True == False
                                                                     True And False == False
                                                                     True And True == True
Or           left to right    logical OR                             False Or False == False
                                                                     False Or True == True
=            right to left    assignment                             a = b, a = b + c


Associativity of operators is generally left to right, except for assignment and negation. For
example, the expression a = c = d assigns the value d to c, then assigns c to a, thereby making
all three have equal value.

Table B.2 shows Musimat's simplified operator precedence and associativity in order from
highest to lowest. This precedence list is a shortened version derived from C and C++. Since you
can't effectively read or write computer programs unless you have memorized these rules of
operator precedence and associativity, experts recommend that you study these tables while you brush
your teeth every night (Press et al. 1988, 23).

Warning: some expressions that might seem to have self-evident meaning can't be expressed as
such in C/C++ and so don't work in Musimat either. Take the expression c > b > a, for example.
You'd hope it would test whether b lies between a and c. Alas. Consider this example:

If (3 > 2 > 1) Print("true") Else Print("false")

It first evaluates (3 > 2), which it discovers is True, and replaces this expression with the integer
1 (which stands for True in C++). It then evaluates the expression (1 > 1), which is False.
Probably not what we wanted. This example can be rewritten as follows:

If (3 > 2 And 2 > 1) Print("true") Else Print("false")

which will print true.
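The pitfall can be reproduced in Python by writing the left-to-right grouping explicitly. One caution about the illustration: Python, unlike C++ and Musimat, treats an unparenthesized 3 > 2 > 1 as a chained comparison, so the parentheses below are needed to emulate the C++ behavior. The variable names are ours.

```python
# C++-style left-to-right evaluation: (3 > 2) yields True, which compares as 1.
left_to_right = (3 > 2) > 1          # True > 1 is the same as 1 > 1

# The rewritten test with an explicit logical AND.
rewritten = (3 > 2) and (2 > 1)

# Python's own chained comparison, shown for contrast.
chained = 3 > 2 > 1

if __name__ == "__main__":
    print(left_to_right)   # False
    print(rewritten)       # True
    print(chained)         # True
```

The explicit And form behaves the same way in all three languages, which is a good reason to prefer it.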










Glossary 


A440 The standard of pitch for Western orchestras, corresponding to 440 Hz. 

Acoustics The study of signals and signaling systems where the medium is air.

ADSR Segments of the amplitude envelope named for the initial letters of each segment: attack, decay, sustain, and release.

Amplitude Distance of a wave from its peak height to its point of zero displacement or equilibrium. Also called peak amplitude. Peak-to-peak amplitude is the distance from crest to trough. RMS amplitude is the average energy of a sinusoid, based on its amplitude.

Anechoic chamber A room that is so padded that it produces no echoes, thereby eliminating reverberation; usually also isolated from external noise sources.

Antinode Point where displacement due to vibration is greatest.

Atmosphere Average atmospheric pressure at sea level, with a standardized value of 101,325 Pa.

Band A range of frequencies within a spectrum.

Band center Geometric mean frequency of a band. For a band extending from 707 Hz to 1.414 kHz, the band center frequency is 1000 Hz.

Bandwidth Distance between upper and lower frequency limits of a sound.

Beat Fundamental unit of time measurement, corresponding to the pulse of the music.

Causal System that references only current and past input and past output. Causal systems may not reference future input or current or future output.

Chaotic system A deterministic system that appears to be random such that it is impossible to make long-range predictions about its behavior.

Complex system System that contains elements that are both differentiated (specialized or compartmentalized) and integrated (connected or unified) on all levels of scale.

Compliance The reciprocal of stiffness.

Continuous distribution A distribution where the events in the sample space cannot be individually distinguished. Temperature and frequency are examples of continuous distributions.

Critical bands Channels of frequency-selective psychoacoustic processing that affect our perception of pitch, loudness, and masking of components lying within a critical frequency distance (roughly 1/3 of an octave) of one another.

Damping The effect of energy dissipation on a vibrating system.

Decibel Scale used to measure sound level in sound recording and communications, based on the same logarithmic principle as the Richter scale.

Degree Individual element of a scale; also, 1/360 of a circular arc.

Degrees An ordered set of names and positions of the elements of a scale.

Deterministic Characteristic of systems where every cause has a unique effect.

Diatonic scale Seven pitches per octave composed of degrees in the order 2 2 1 2 2 2 1.

Discrete distribution A distribution where the events in the sample space can be individually distinguished. Tossing coins or dice and picking a note on a keyboard are examples of discrete distributions.





Driven harmonic oscillator A vibrating driving force coupled to a driven simple harmonic oscillator, such as a spring/mass combination.

Duration In music, the number of beats a note lasts. Generally, the elapsed time of an event.

Dynamic range Range from the softest to the loudest sound.

Dynamics A field of classical mechanics that studies how force affects motion of material bodies through time.

Efficiency The ratio of useful power output to the total power input.

Elasticity That property of a material that allows it to restore itself to its original shape after being distorted (stretched, compressed, twisted, etc.).

Enharmonic equivalents Chromatic degrees that sound the same pitch despite having different symbols.

Enumeration An itemized list of all possible outcomes; the sum total of such outcomes.

Envelope Characteristic way in which the intensity of a note changes through time.

Equal-tempered interval The semitone, one twelfth of the pitch distance of an octave, the twelfth root of 2.

Equilibrium The state of a system when it has no acceleration; the resultant when the sum of all external forces acting on a body is zero and the sum of the momentum of all parts of the system is zero.

Event The outcome of a random process, such as a roll of the dice.

Expectation A prediction based on current and past experiences. See also Surprisal.

Formant Group of frequencies of some particular bandwidth that is emphasized by a resonant system.

Frequency Physical measure of vibrations per second.

Fundamental Lowest pitched partial in a tone.

Gamut Entire range of pitches reachable by an instrument or voice.

Harmonics Frequency components of a complex tone that are positive integer multiples (greater than 0) of a fundamental frequency.

Harmony In general, any simultaneous combination of tones. More narrowly, an agreeable (consonant) combination of tones.

Harmony theory The art of organizing multiple concurrent musical lines to reinforce a feeling of harmonic movement and arrival, suspension and resolution.

Heat capacity ratio The ratio of the specific heat of a gas at constant pressure to the specific heat at a constant volume.

Hertz The unit of one cycle per second, abbreviated Hz.

Histogram A table of event occurrences.

Ideal string String that is perfectly flexible, has constant mass per unit length, and is connected to massive nonyielding supports.

In phase The state of multiple objects that vibrate with the same speed and direction.

Inertial reactance The tendency of a mass to resist change in velocity.

Inharmonic partials Components that are not integer multiples of a fundamental.

Interval Difference in pitch between two tones.

Inversion, of an interval Subtracting an interval from an octave produces its inversion. Intervals of a fifth and fourth are each other's inversions.

JND of loudness Amount by which the intensity of a sound must change for the ear to register a difference in loudness.

JND of pitch Amount by which the frequency of a sound must change for the ear to register a difference in pitch.

Just intervals Intervals made from the ratio of small whole numbers.

Key The degree to which a diatonic scale is transposed.

Key signature Association between the key (the chromatic degree that the scale starts on) and the accidentals required for the corresponding diatonic scale.

Limit of hearing The intensity above which sound is registered as (possibly damaging) pain.

Loudness The subjective experience corresponding most closely to sound intensity.





Mass The quantity of matter contained in an object. 

M atter Anything that occupies space and exhibits inertia. 

Mean freepath The average distance a particlecan move in a gas without a collision; in acoustics, the average dis- 
tance a wave front can travel before being reflected at a wall. 

Melody Notesplayed in sequence. 

M etronome mark I ndication of which duration Symbol gets the beat and how many beats there are per minute. 

M icrotone Scale degree that is lessthan a semitone in pitch. 

M odes Variations of the diatonic scale that preserve interval order but begin from other than degree 1 of the diatonic scale. 
Modulation Changing the effective key signature of a musical work through the introduction of accidentáis not in the 
original key signature. 

M onteCarlo method A ny technique that uses probability to study complex Systems. 

n-limit The highest prime factor of any interval in a musical scale; used asa measureof scale complexity. 

Node A pointwheredisplacementdueto vibration iszero. 

Normal forcé A forcé that is perpendicular to surfaces that are i n contact. 

Note A tone placed in temporal context by an onsettimeand duration. SeeTone. 

Octave Ratio of 2/1 between frequencies; the musical quality of equivalence. 

Octaveequivalence The principie that scale degrees perform the same musical function regardless of the octave in 
which they are played. 

Onset The time when a sound begins; the moment stipulated by thescorefor a note to begin. 

Oscillate To moveor swing regularly and continuously from sideto side. 

Overtones Harmonio components in a tone that are pitched higherthan the fundamental. 

Partials Individual sinusoidsthatcollectively makeupan instrumental tone; also called components. 

Period One complete movement through all the phases of a periodic vibration; for a sinusoid, one period corresponds 
to one complete revolution of a circle. 

Permutation The number of possi ble unique orderings. 

Phase The fraction of a complete rotation through which an object has advanced; characteristic points, such as peaks, 
troughs, and zero-crossings reached periodically each time a wave repeats. 

Phon A measure of equal loudness. See Soné. 

Phon scale A loudness scale that identifies equal loudnessesacrossall perceivable frequencies and intensities. 

Pitch Subjective experience corresponding to the frequency of sounds. 

Polyphony Theartof sounding morethan one musical lineconcurrently. 

Precession time The period required for a higher-frequency vibration to depart from and then retum into alignment 
with a lower-frequency vibration. 

Prime number A n integer that is not divisible by any other number besides itself and 1. 

Probability The relative likelihood of an event, usually expressed as a real number in the range of 0 to 1. 

Probability distribution A function, graph, or listing of the probabilities of the sample space that shows how 
probability is distributed among the possible events. 

Programming language A specialized means of describing rule systems and methods. 

Psychophysics Psychology of perception, focusing on the boundary between physical and psychological phenomena. 
Psychoacoustics includes the psychophysics of audition. 

Quality factor The ratio of the resonant frequency to the bandwidth 3 dB down from peak amplitude. 
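
The ratio in the quality factor definition can be sketched as follows. The function name and the example frequencies are assumptions chosen for illustration; the lower and upper frequencies stand for the two points 3 dB below the peak.

```python
def quality_factor(resonant_hz: float, lower_hz: float, upper_hz: float) -> float:
    """Q = resonant frequency / bandwidth between the two -3 dB points."""
    bandwidth = upper_hz - lower_hz
    if bandwidth <= 0:
        raise ValueError("upper_hz must be greater than lower_hz")
    return resonant_hz / bandwidth


# Hypothetical resonator: peak at 440 Hz, -3 dB points at 435 Hz and 445 Hz.
q = quality_factor(440.0, 435.0, 445.0)
print(q)  # 44.0
```

A narrower bandwidth around the same resonant frequency gives a larger value, which is why a high quality factor indicates a sharply tuned resonance.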

Random variable Index of a probability distribution function. 

Resonance The tendency of a system to vibrate sympathetically at a particular frequency in response to energy 
induced at that frequency. 

Resonant frequency The frequency that is most effective at enabling a vibrating system to return to its original energy 
level by dissipation. 





Restoring force Internal force that seeks to return an elastic object to its original shape. 

Rhythm That which pertains to the temporal quality of musical notes and phrases. Onset and duration largely 
determine rhythm. 

Rubato Gradual perturbations in the tempo. 

Sample space The set of possible outcomes. 

Scale A named, ordered set of pitches, together with a formula for specifying their frequencies. 

Score Combination of notes ordered vertically by pitch and horizontally by time. 

Self-similarity Structures that show similarities at all levels of magnification are self-similar. 

Series Summation of a repeating pattern of terms. A particular ordering of a set. 

Set An unordered collection of any size. 

Set class A named group of sets that are equivalent under specific conditions. 

Signal A physically detectable quantity such as an acoustical wave that traverses a signaling system. 

Signaling system A system that combines time, space, source, medium, and receiver. 

Silence Sensory percept of the absence of detectable sound intensity at any frequency. 

Simple harmonic motion Vibratory motion in one dimension caused by the interaction of inertia and elastic forces. 

Sone A measure of comparative loudness. See Phon. 

Sonority The sonic character of a musical interval. 

Sound pressure level Average pressure variation per unit area. 

Spectrum The range of all possible frequencies at all possible intensities. 

Staff Five horizontal lines that serve as a grid indicating pitch range (vertically) and relative note onset (horizontally) 
in common music notation. Attributed to Guido d'Arezzo. 

Standard temperature and pressure (STP) One atmosphere of pressure at 0° Celsius (or 273.15 K). 

Standing waves Waves constrained by wavelength to match the dimensions of physical boundaries. Waves whose 
shape remains constant and only their amplitude changes; waves whose height is scaled through time in the direction 
perpendicular to their length. 

Static equilibrium A system in which the sum of applied forces is zero and does not change through time. 

Stiffness The ratio of applied force to the resulting displacement. 

Surprisal As the probability of an event decreases from 1.0 towards 0, the surprisal goes from zero to infinity. See 
Expectation. 
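
The behavior described for surprisal matches the usual information-theoretic formula, the logarithm of the reciprocal of the probability. The sketch below assumes that formula with a base-2 logarithm (so the result is in bits); the base and the function name are assumptions, not stated in the glossary entry.

```python
import math


def surprisal(p: float) -> float:
    """Surprisal in bits of an event with probability p, assuming log2(1/p)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("probability must lie in the range 0 to 1")
    if p == 0.0:
        return math.inf  # an impossible event would be infinitely surprising
    return math.log2(1.0 / p)


for p in (1.0, 0.5, 0.25, 0.125):
    print(p, surprisal(p))
```

A certain event (probability 1.0) carries zero surprisal, and each halving of the probability adds one bit.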

System A combination of interdependent components that can be viewed as a unified whole. Any function that 
produces one or more outputs based on zero or more inputs. 

Tempering The practice of adjusting some of the degrees of the scale to irrational values so as to fit within an 
overarching order that is still based on simple integer ratios. 

Tempo Number of beats per minute. 

Threshold of hearing Minimum amount of sound intensity required for a sinusoid to be detected by a listener in a 
noiseless environment. 

Timbre That which allows us to distinguish notes of equal pitch, loudness, and duration; the name of a sound source 
(such as trumpet, violin) or a quality of a sound source (such as sharp, dull). 

Time signature Stipulation of how many beats there are per measure and which note gets the beat. In 3/4 time, there are 
three beats per measure (indicated by the numerator) and the quarter note gets the beat (indicated by the denominator). 

Tonal palette Coloration based on the placement of various-sized intervals in a scale. 

Tone Combination of pitch, loudness, and timbre. An ideal tone has constant pitch, loudness, and timbre; 
conventionally, the term describes any reasonably uniform combination of the three properties. A sound without discernible pitch 
(such as a drum beat) is not a tone. When placed in a temporal context, a tone becomes a note. See Note. 

Tone row A series based on a set of pitch classes. 

Transpose To start a scale on any chromatic degree but C. 





Uniform distribution If all events in a sample space are equally likely, the resulting distribution is said to be uniform. 

Unison 1/1 ratio between frequencies. Tones sounding at the same pitch. The musical quality of identity. 

Wave An organized traveling disturbance in a medium, such as air. 

Waveshape Characteristic internal organization of a sound wave, responsible for determining the timbre, or sound 
quality of a sound. 

Well-tempered Characteristic of tuning systems that temper at least some intervals or have reasonably equal-sized 
semitones. 

Wolf fifth Nonharmonic intervals that cause beating between the interval and the overtone series, making it sound 
unpleasantly like wolves howling. 

Work The force applied to move an object times the distance it is moved. 




Notes 


Preface 


1. From a Chinese fortune cookie opened the night the first page was written. 


Chapter 2 


1. This is a bit of an oversimplification. Our experience of pitch also depends on loudness, among other factors. For the 
full story, see section 6.5.1. 

2. Curiously, the diatonic major scale begins with the letter C, not A. I've never seen a sensible explanation for this fact. 

3. The practice of singing aided by solmization syllables was developed by Guido d'Arezzo, a Franciscan monk of the 
tenth century. The practice is called solfeggio. 

4. For some transpositions, it may be necessary to raise a note that is already sharp, hence the double sharp; similarly, it 
is sometimes necessary to lower an already flat tone, hence the double flat. 

5. Generally, one must study the harmonic semantics of the score to determine whether the major or minor key is 
indicated by the key signature. 

6. 1862-1918. See, for example, Debussy's piano prelude Voiles. 

7. 1917-1982. Monk used whole-tone scales almost as a signature in many of his jazz compositions. 

8. The term overtone generates confusion in numbering. Note that the first partial is the fundamental, while the second 
partial is the first overtone. Thus, for example, overtone number 10 is partial number 11. To avoid confusion, I'll 
generally avoid the term overtone, preferring partial or component. Since the term partial is primarily an adjective, I'll 
use it only when I think the context is clear. 


Chapter 3 


1. Helmholtz (1863); second English edition (1885), 250. 

2. In 1995 the paleontologist Ivan Turk of the Slovenian Academy of Sciences discovered what appears to be a fragment 
of a flute made from a cave bear thigh bone in a Neanderthal archaeological site. It was subsequently radio-carbon-dated 
to be about 43,000 years old. There is an ongoing controversy over whether it is a flute or not, and if so, what scale it 
would have played. Whether it is proved or not, it suggests we should consider radically revising backward in time what 
musicologists refer to as early music. 

3. The cent scale was developed by Alexander Ellis, who translated into English Helmholtz's treatise On the Sensations 
of Tone (1863), one of the first scientific studies of consonance. 

4. The term diatonic originally referred to a scale constructed from two (dia) tetrachords. The tetrachord was a scale 
building block in ancient Greek music theory. 





5. Robert Fludd, History of the Macrocosm and Microcosm (1617). See Debus (1979) and Godwin (1979). 

6. Ptolemy, "Harmonics," in Barker, Stevens, and le Huray (1984), 270-360. 

7. It is a descending syntonic comma because the pure major third is smaller than the Pythagorean major third. 

8. A function f(x) is said to be monotonic in x if f always changes in the same direction as x. 

9. The equation for the fitted curve is y = 1.9 + 0.12x + 0.18x^2. 

10. Francesco Antonio Vallotti, Trattato della Scienza Teorica e Pratica della Moderna Musica. Conceived in 1728, his 
ideas weren't published until 1779. 

11. J. S. Bach, The Well-Tempered Clavier, comprising two books (1722 and 1744), each having 24 sets of preludes and 
fugues in every major and minor key. 

12. Simon Stevin's Van de Spiegheling der Singconst (On the Theory of the Art of Singing), written ca. 1605, was first 
published in 1884, 264 years after he died. See also Cohen (1987). 

13. Partch, from the liner notes of his RCA phonograph record Castor and Pollux. 

14. I studied sitar in India with S. Dagar and in the United States with Pandit Nikil Banerjee. 

15. Kees van Prooijen apparently also discovered the tempered version of this scale in the 1970s. 


Chapter 4 

1. But there are some interesting cases where this assumption leads into the weeds (see section 9.17.2). 

2. For example, consider this ratio of small but nonzero values: a tenth divided by a billionth. Such a ratio is not a small 
number. 

3. It's important to note that the backward velocity is just the velocity between points A and B; it is not about having a 
negative slope. 

4. This is why "speed kills." Reaction time is constant, but the time required to stop is the square of the speed. 

5. Actually, log (2) = 0.30103 . . . , but the fractional part beyond the tenths position is often ignored for practical 
measurements. 


Chapter 5 


1. If you are uncomfortable with the radian's being a dimensionless number, you probably will seize upon this 
definition of the radian as proof that its dimension is in units of degrees. However, the degree is also dimensionless. In 
fact, all angle measures, including trigonometric functions, are dimensionless. Also note that a radian is only 
approximately 57.3°. 
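
The approximate figure can be checked numerically. The sketch below assumes only the standard relation that a full circle is 2π radians or 360 degrees; the variable names are illustrative.

```python
import math

# One radian expressed in degrees: 180 / pi.
one_radian_in_degrees = math.degrees(1.0)
print(round(one_radian_in_degrees, 4))  # 57.2958

# An angle in radians is an arc length divided by a radius, so the
# length units cancel and the result is a pure number.
arc_length_m = 3.0
radius_m = 2.0
angle_rad = arc_length_m / radius_m
print(angle_rad)  # 1.5
```

Rounding 57.2958 to one decimal place gives the 57.3° quoted in the note.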

2. The radian was developed in 1873 by James Thomson, a professor of mathematics at Queens College, Belfast, 
Northern Ireland. His brother was the famous physicist William Thomson, Lord Kelvin. 

3. It is customary to use t for linear time and T for periodic time. 

4. This is the proof that there is no such thing as centrifugal force. If there were, and it applied a force to the object 
directly away from the axis of rotation, then the object should fly radially away when released, but it does not. Instead, 
circular motion is the vector sum of centripetal force and linear velocity. 

5. For example, simple electrical multimeters use this approach when displaying RMS voltage. 


Chapter 6 


1. Bregman (1990) gives a monumental description of the factors involved in constructing auditory scenes. Handel 
(1989) and Yost (2000) provide an easier introduction. 

2. The majority of cues we use for source identification lie within this frequency band, suggesting that our hearing may 
have adaptively evolved to be more sensitive to it. 





3. The purpose of the ossicle chain in the middle ear as an impedance matching system was first pointed out by 
Helmholtz (1863). 

4. Also called noise-induced temporary threshold shift (NITTS). 

5. These symbols are used because when pronounced, Φ (phi) and Ψ (psi) sound like the initial syllables of the words 
physical and psychological, respectively. 

6. Blind men, touching various parts of an elephant, report conflicting accounts of their experience depending upon the 
part they touch, then fall into an argument as to whose account is the correct interpretation. The poem by John Godfrey 
Saxe (1816-1887) describing this event concludes, "So oft in theologic wars, / The disputants, I ween, / Rail on in utter 
ignorance / Of what each other mean, / And prate about an Elephant / Not one of them has seen!" 

7. Ernst Weber (1795-1878). 

8. A third important attribute is accuracy, not to be confused with precision. Precision has only to do with the fineness of 
measure. A ruler with very fine gradations may measure precisely, but if it is warped, it will not measure accurately. 

9. Imagine a point light source positioned on the y-axis above the spiral in figure 6.5, shining down through the coils 
onto the floor. 

10. The impossible staircase was invented by the Swedish artist Oscar Reutersvärd and later independently reinvented 
by Lionel Penrose and Roger Penrose. It was made famous in M. C. Escher's print Ascending and Descending. 

11. In fact, the German organist Georg Andreas Sorge published a description of the same phenomenon in 1744, but 
Tartini's observation is most frequently cited. 

12. This is by no means the only possible or the best definition for these terms, but it will serve for this simplified 
example. 

13. I had the privilege of being one of Grey's subjects. 


Chapter 7 

1. Since the balls represent packets of air rather than individual molecules, we can ignore the random microscopic 
motion of the individual molecules. 

2. Ludwig Boltzmann, Austrian physicist (1844-1906). 

3. At a great distance from a sound's origin, a listener experiences the waves to be plane rather than spherical because 
the circumference of the wave front is by that time very large in comparison to the local experience of it. However, the 
total wave is still actually spherical. See section 4.24.4. 

4. A good modern treatment of the subject is given in Sharp (1996). 

5. Since the bars are shorter, they have less mass, but the elasticity of the wire is the same, so the rate of wave 
propagation increases. 

6. Christiaan Huygens, mathematician, physicist, astronomer, lutanist, and music theorist (1629-1695). 

7. Named for the British Astronomer Royal Sir George Bidwell Airy (1801-1892). 

8. Kids at home: don't try this! 

9. For a dramatic telling of the story, see Bliven (1976). 


Chapter 8 


1. Robert Hooke, physicist, biologist, astronomer, and architect (1635-1703). 

2. This expression means "as long as x is much less than I." This restriction prevents us from having to consider the 
nonlinear vibratory behavior of pendulums that can swing more widely. 

3. Hermann von Helmholtz, a scientist whose contributions spanned physics, biology, and acoustics (1821-1894). His 
book On the Sensations of Tone is still widely referenced. 

4. An explosive and racy Châteauneuf, it fairly burst with game, berry, black chocolate, and espresso characteristics. 
Ripe and sweet-tasting, it had enough opulent fruit to balance the firm tannin structure, like a rose growing up the 
impenetrable wall of its spectacular finish. (Kids at home: don't try this.) 





5. Even in outer space, the internal friction of the spring would eventually dissipate all of the system's energy, but we 
ignore this effect as well. 

6. Pitch is "head over heels" rotation, yaw is spinning side-to-side rotation, and roll is "over your shoulder" rotation. 
Define three axes through your center of gravity as follows: x is across your body, y is head-to-toe, and z is front to back. 
Pitch is rotation in x, yaw is rotation in y, and roll is rotation in z. 

7. The classical guitarist Andrés Segovia used no amplification during concerts, even though excellent sound 
reinforcement was available by the end of his career. But there was no need: his sound adequately reached his thousands 
of listeners, who listened in a hush. True acoustic performance seems like a lost ideal in today's public concerts. 

8. This equation has been attributed to Mersenne (from his "laws of stretched strings" in Harmonie Universelle) and to 
Brook Taylor (1685-1731) in 1714. 

9. Named after Thomas Young (1773-1829). 

10. Published figures vary from about 69 to 79 for aluminum, so 74 is about in the middle. 

11. Though a flute may look like it's closed at one end, the fipple of the flute is effectively an opening, so it is open at 
both ends. 

12. The point 3 dB down from the peak energy point is sometimes called the half-power point, a figure used commonly 
for this purpose by engineers, because 3 dB corresponds to a power ratio of approximately 2 (an amplitude ratio of 
approximately the square root of 2). 
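
As a rough numeric check of the half-power figure, the sketch below converts decibels to ratios. It assumes the conventional definitions, in which a level in dB is 10 times the base-10 logarithm of a power ratio, or equivalently 20 times the base-10 logarithm of an amplitude ratio; the function names are illustrative.

```python
import math


def db_to_power_ratio(db: float) -> float:
    """Power ratio for a level difference in dB (assumes dB = 10 * log10(ratio))."""
    return 10.0 ** (db / 10.0)


def db_to_amplitude_ratio(db: float) -> float:
    """Amplitude ratio for a level difference in dB (assumes dB = 20 * log10(ratio))."""
    return 10.0 ** (db / 20.0)


print(round(db_to_power_ratio(-3.0), 3))     # close to 0.5: about half the power
print(round(db_to_amplitude_ratio(3.0), 3))  # close to the square root of 2
print(round(math.sqrt(2.0), 3))
```

Under these definitions 3 dB is only approximately a factor of 2 in power; the exact half-power point lies near 3.01 dB.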


Chapter 9 


1. Augusta Ada Byron King, Countess of Lovelace, note A, 694, in her notes added to the end of her English translation 
of Luigi F. Menabrea, Notions sur la Machine Analytique de M. Charles Babbage, Bibliothèque Universelle de Genève, 
41, 352-376. Her translation was published under the pseudonym AAL in Richard Taylor's Scientific Memoirs, 3, art. 
29, 666-731, under the title "Sketch of the Analytical Engine invented by Charles Babbage, Esq., by L. F. Menabrea of 
Turin, officer of the Military Engineers," August 1843. 

2. The term algorithm derives from the name of ninth-century Persian mathematician, geographer, and astronomer, Abu 
Jafar Mohammed ibn Musah al-Khorezmi, inventor of modern decimal positional arithmetic and algebra. Al-Khowarizm 
means citizen of Khowarizm, known today as Khorezm in Uzbekistan. Algorizm, the precursor to the modern term 
algorithm, is a transliteration of the last part of his name. His treatise on arithmetic was titled Kitab al jabr 
w'al-muqabala, commonly translated as "Rules of restoration and reduction." The word al jabr is the origin of the 
term algebra. 

3. Barbara Cook Loy, private communication. 

4. The precise relation between rate of change and frequency is developed in volume 2. 

5. Bailey and Crandall (2001). Though the expansion of π appears to be random, this has not been proven. Expansion of 
other irrational numbers, such as e and log 2, might also be random but, again, this has not been proven. 

6. Notice that Lorenz conjectures that a butterfly might "set off" rather than "cause" a tornado. This is an important 
distinction, suggesting that the initial conditions serve to select an outcome from many possibilities. 

7. An interesting paradox in mathematics concerns the cardinality of the set of points on a line. Georg Cantor 
established that C, the cardinality of all real numbers (corresponding to the number of points on a line), is greater than ℵ₀, 
the cardinality of all integers. But how much greater is C than ℵ₀? In particular, is there a transfinite number between ℵ₀ 
and C? Cantor's continuum hypothesis states that there is no such transfinite number. However, it has been demonstrated 
that the validity of the continuum hypothesis is undecidable. Using the standard axioms of set theory, Kurt Gödel showed 
that the continuum hypothesis is impossible to disprove. Later, Paul Cohen showed that it is impossible to prove under 
the same conditions. Hence, the continuum hypothesis is independent. The independence of the continuum hypothesis 
has been taken as an exhibit of Gödel's incompleteness theorem, because it is an important question that has been proven 
to be undecidable, even though the proofs are based on the standard and universally accepted axioms of mathematics. 

8. The midpoint of an 88-key keyboard is between E and F above middle C. 
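
The midpoint claim can be verified by numbering the keys. The sketch assumes the standard 88-key layout, in which key 1 is the lowest A and key 88 the highest C, and labels octaves in scientific pitch notation (middle C is C4); the naming scheme and the function name are assumptions, not taken from the text.

```python
NOTE_NAMES = ["A", "A#", "B", "C", "C#", "D", "D#", "E", "F", "F#", "G", "G#"]


def key_name(key_number: int) -> str:
    """Name of piano key 1..88, assuming key 1 is A0 and key 88 is C8."""
    if not 1 <= key_number <= 88:
        raise ValueError("key_number must be in 1..88")
    index = key_number - 1
    name = NOTE_NAMES[index % 12]
    # The octave number increases at each C; keys 1-3 (A0, A#0, B0) are in octave 0.
    octave = (index + 9) // 12
    return f"{name}{octave}"


# With 88 keys, the midpoint falls between keys 44 and 45.
print(key_name(40))                # C4 (middle C)
print(key_name(44), key_name(45))  # E4 F4
```

Keys 44 and 45 are the E and F immediately above middle C, so the boundary between them is the midpoint of the keyboard.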

9. The analogy between entropy and information has been criticized by some physicists. There are implications in the 
equation for entropy that are not matched for information. However, this dispute need not concern us here: the analogy 
between information and entropy has become a fixture in the literature. 

10. After R. V. Hartley, who in 1927 proposed using logarithms to measure information. 

11. Aristoxenus, "The Harmonics," in Macran (1902), 27-30. 





12. A mathematical construction is called pathological if it is created simply to invalidate an otherwise universally valid 
assertion. 

13. Plato, Laws, bk 49, in Pangle (1980). 

14. J. S. Bach, 389 Choralgesänge für Vierstimmigen Gemischten Chor. Nr. 3765. Breitkopf Edition. 

15. Herbert Bielawa, private communication. 

16. For example, "up a third" is from a C chord to an E chord; "down a fifth" is from a G chord to a C chord, and so on. 

17. Dolson (1989). I am indebted to this article for its intuitive explanation of back propagation. 

18. J. S. Bach, 389 Choralgesänge für Vierstimmigen Gemischten Chor. Nr. 3765. Breitkopf Edition. 

19. Haus and Sametti (1991, 7). The multiplicity extension is a partial implementation of self-modifying nets, which 
were introduced by Valk (1978). 

20. J. S. Bach, "Canon Perpetuus" from A Musical Offering. BWV 1073. London: Boosey and Hawkes, 1952. 

21. See, for example, Harel (1987), an important early theoretical paper. For more recent practical developments, see, 
for example, Samek (2002). 

22. Landon (1976, 508-509). Leopold Mozart quoted Haydn's comment in a letter to his daughter. The encounter 
transpired after Haydn heard Mozart's Bb Maj. Quartet K456, "The Hunt" in 1785. The phrase "knowledge of 
composition," kompositionswissenschaft, means literally "composition Science." 

23. Flavius Magnus Aurelius Cassiodorus, Senator (ca. 485-ca. 575), Institutiones, II, iii, paragraph 21, in Strunk 
(1950). 


Appendix B 

1. A simple MUSIMAT emulator written in C++ is available at http://www.musimathics.com/. 

2. Prior to the 1940s, when someone said "computer," they typically referred to a person who performed computations 
manually or with the aid of a calculating machine. It was not until the 1950s that "robot brains" began to supplant human 
computers. 

3. We can also exit a Repeat statement with a Return statement. 

4. Functions of the same name that vary in the number or type of arguments or type of return value are said to be 
polymorphic. Musimat manages to keep the various versions separate from each other and to use the correct one in 
every instance. 

5. Musimat source code is available at http://www.musimathics.com/. 




References 


Allen, J. B., and S. T. Neely. 1997. "Modeling the Relation Between the Intensity JND and Loudness for Pure Tones and 
Wide-Band Noise." Journal of the Acoustical Society of America 102 (December): 3628-3646. 

Ames, Charles. 1983. "Stylistic Automata in Gradient." CMJ: Computer Music Journal 7 (4): 45. 

-. 1987. "Automated Composition in Retrospect: 1956-1958." Leonardo 20 (2): 169-185. 

ANSI (American National Standards Institute). 1999. Acoustical Terminology. ANSI S1.1-1994 (R1999). 

Antoni, Giovanni degli, and Goffredo Haus. 1982. "Music and Causality." In Proceedings of the International Computer 
Music Conference, 279. 

Apel, Willi. 1944. Harvard Dictionary of Music. Cambridge, Mass.: Harvard University Press. 

Ashmore, J. F. 1987. "A Fast Motile Response in Guinea-Pig Outer Hair Cells: The Cellular Basis of the Cochlear 
Amplifier." Journal of Physiology 388: 323-347. 

Attali, Jacques. 1985. Noise: The Political Economy of Music. Minneapolis: University of Minnesota Press. 

Backus, John. 1961. "Pseudoscience in Music." Journal of Music Theory 55: 220-232. 

Bailey, D. H., and R. E. Crandall. 2001. "On the Random Character of Fundamental Constants." Experimental 
Mathematics 10 (June): 175-190. 

Barbour, J. Murray. 1947. "Bach and the Art of the Temperament." Musical Quarterly 33 (January): 64-89. 

-. 1953. Tuning and Temperament: A Historical Survey. East Lansing: Michigan State College Press. 

Barker, Andrew, John Stevens, and Peter le Huray, eds. 1984. Greek Musical Writings. Vol. 2: Harmonic and Acoustic 
Theory. Cambridge: Cambridge University Press. 

Barnes, John. 1979. "Bach's Keyboard Temperament: Internal Evidence from the Well-Tempered Clavier." Early Music 
7 (April): 236-249. 

Beckman, Petr. 1976. A History of Pi. New York: St. Martin's Press. 

Békésy, G. von. 1960. Experiments in Hearing. New York: McGraw-Hill. 

Benade, Arthur H. 1973a. "The Physics of Brasses." Scientific American 229 (July): 24-35. 

-. 1973b. Trumpet Acoustics. Cleveland: Case Western Reserve University. 

Benade, Arthur H., and J. S. Murday. 1967. "Measured End Corrections for Woodwind Tone Holes." Journal of the 
Acoustical Society of America 41: 1609. 

Beranek, L. L. 1962. Music, Acoustics, and Architecture. New York: Wiley. 

-. 1986. Acoustics. Rev. ed. Melville, N.Y.: American Institute of Physics. 

Bismarck, G. von. 1974. "Sharpness as an Attribute of the Timbre of Steady State Sounds." Acustica 30: 159. 

Bliven, Bruce, Jr. 1976. "Annals of Architecture: A Better Sound." New Yorker, November 8, 51-135. 

Bobrow, D. G., and D. A. Norman. 1975. "Some Principles of Memory Schemata." In Representation and 
Understanding: Studies in Cognitive Science. Ed. D. G. Bobrow and A. M. Collins. New York: Academic Press. 

Bohlen, Heinz. 1978. "13 Tonstufen in der Duodezime." Acustica 39. English trans.: "13 Tone Steps in the Twelfth." 
Acustica 87 (2001, no. 5): 617-624. 





Boon, Jean Pierre, and Olivier Decroly. 1995. "Dynamical Systems Theory for Music Dynamics." Chaos 5: 501-508. 

Bosi, Marina, and Richard E. Goldberg. 2003. Introduction to Digital Audio Coding Standards. Dordrecht, The 
Netherlands: Kluwer. 

Bregman, Albert S. 1990. Auditory Scene Analysis: The Perceptual Organization of Sound. Cambridge, Mass.: MIT Press. 

Brün, Herbert. 1970. "From Musical Ideas to Computers and Back." In The Computer and Music. Ed. Harry Lincoln. 
Ithaca, N.Y.: Cornell University Press. 

Buchner, Alexander. 1956. Mechanical Musical Instruments. Trans. Iris Urwin. London: Batchworth Press. 

Burkert, Walter. 1972. Lore and Science in Ancient Pythagoreanism. Trans. Edwin L. Minar Jr. Cambridge, Mass.: 
Harvard University Press. 

Cage, John. 1961. Silence. Cambridge, Mass.: MIT Press. 

Cohen, Alexander, J. Anticaglia, and H. H. Jones. 1970. "Sociocusis: Hearing Loss from Nonoccupational Noise 
Exposure." Sound and Vibration 4 (November): 12-20. 

Cohen, H. F. 1987. "Simon Stevin's Equal Division of the Octave." Annals of Science 44 (5): 471-488. 

Cope, David. 1996. Experiments in Musical Intelligence. Middleton, Wisc.: A-R Editions. 

-. 1999. Virtual Mozart. CRC 2452. Baton Rouge, La.: Centaur Records. 

-. 2001. Virtual Music. Cambridge, Mass.: MIT Press. 

Cowell, Henry. 1930. New Musical Resources. New York: Knopf. 

Dalmont, J. P., C. J. Nederveen, and N. Joly. 2001. "Radiation Impedance of Tubes with Different Flanges: Numerical 
and Experimental Investigations." Journal of Sound and Vibration 224 (3): 505-534. 

Debus, Allen G. 1979. Robert Fludd and His Philosophical Key. New York: Neale Watson Academic. 

Devlin, Keith. 1994. Mathematics: The Science of Patterns. New York: Scientific American Library. 

Dolson, Mark. 1989. "Machine Tongues XII: Neural Networks." CMJ: Computer Music Journal 13 (3). Also in Music 
and Connectionism. Ed. Peter Todd and Gareth Loy. Cambridge, Mass.: MIT Press, 1991. 

Drew, David. 1954/1955. "Messiaen: A Provisional Study." The Score, December 1954, 33-49; September 1955, 9-73; 
December 1955, 41-61. 

Ebcioglu, Kemal. 1986. "An Expert System for Harmonizing Chorales in the Style of J. S. Bach." Ph.D. diss., 
Department of Computer Science, State University of New York, Buffalo, New York. Also in Understanding Music with 
AI: Perspectives on Music Cognition, 294-334. Ed. M. Balaban, K. Ebcioglu, and O. Laske. Menlo Park, Calif.: AAAI 
Press, 1992. 

-. 1988. "An Expert System for Harmonizing Four-Part Chorales." CMJ: Computer Music Journal 12 (3): 43-51. 

Erickson, R. F. 1975. "The DARMS Project: A Status Report." Computers and the Humanities 9 (6): 291-298. 

Euler, Leonhard. 1766. "Conjecture sur la raison de quelques dissonances generalement recues dans la musique [Conjecture 
as to why dissonant tones are generally heard in music]." In Memoires de l'academie des Sciences de Berlin 20 (1766): 
165-173. Also in Opera Omnia Ser. 3, Vol. 1, 508-515. 

Eves, H., ed. 1972. Mathematical Circles Squared. Boston: Prindle, Weber and Schmidt. 

Eyring, C. F. 1933. "Methods of Calculating the Average Coefficient of Sound Absorption." Journal of the Acoustical 
Society of America 4: 178-192. 

Falconer, K. 1990. Fractal Geometry. New York: Wiley. 

Fletcher, H. 1940. "Auditory Patterns." Reviews of Modern Physics 12: 47-65. 

Fletcher, H., and W. J. Munson. 1933. "Loudness, Its Definition, Measurement, and Calculation." Journal of the 
Acoustical Society of America 5: 82-108. 

Forte, Allen. 1973. The Structure of Atonal Music. New Haven: Yale University Press. 

Freyd, Jennifer J. 1987. "Dynamic Mental Representations." Psychological Review 94 (4): 427-438. 

-. 1993. "Five Hunches about Perceptual Processes and Dynamic Representations." In Attention and Performance 
XIV: Synergies in Experimental Psychology, Artificial Intelligence, and Cognitive Neuroscience. Ed. D. Meyer and S. 
Kornblum. Cambridge, Mass.: MIT Press. 

Fux, Johannes J. 1725. Gradus ad Parnassum (Steps to Parnassus: The Study of Counterpoint). Trans. Alfred Mann with 
John St. Edmunds. New York: W. W. Norton, 1943. 



Galilei, Galileo. 1623. The Assayer. Rome. 

-. 1638. "Two New Sciences Including Centers of Gravity and Force of Percussion." Trans. Stillman Drake in 
Galileo Galilei: Two New Sciences, 96-108. Madison: University of Wisconsin Press, 1974. Trans. Henry Crew and 
Alfonso de Salvio in Dialogues Concerning Two New Sciences. New York: McGraw-Hill, 1963. 

Galilei, Vincenzo. 1581. Dialogo della Musica Antica e Moderna. Florence: Marescotti. Trans. Claude V. Palisca as 
Dialogue on Ancient and Modern Music. New Haven: Yale University Press, 2003. 

Gardner, Martin. 1978. "Mathematical Games." Scientific American 238 (March). 

Genrich, H. J., and K. Lautenbach. 1981. "System Modelling with High-Level Petri Nets." Theoretical Computer 
Science 13: 109-136. 

Gerigk, Herbert. 1934. "Würfelmusik." Zeitschrift für Musikwissenschaft 16 (7/8): 359-363. 

Gill, Stephen. 1963. "A Technique for the Composition of Music in a Computer." Computer Journal 6 (2): 129. 

Godwin, Joscelyn. 1979. Robert Fludd, Hermetic Philosopher and Surveyor of Two Worlds. London: Thames and 
Hudson. 

Goldstein, J. L. 1973. "An Optimum Processor Theory for the Central Formation of the Pitch of Complex Tones." 
Journal of the Acoustical Society of America 54: 1496-1516. 

Greenwood, D. D. 1961a. "Auditory Masking and the Critical Band." Journal of the Acoustical Society of America 33: 
484-501. 

-. 1961b. "Critical Bandwidth and the Frequency Coordinates of the Basilar Membrane." Journal of the Acoustical 
Society of America 33 (4): 1344-1356. 

Grey, John M. 1975. "An Exploration of Musical Timbre." Ph.D. diss., Center for Computer Research in Music and 
Acoustics/Department of Music, Stanford University, Stanford, California. STAN-M-2. 

Guinan, J. J., and W. T. Peake. 1967. "Middle Ear Characteristics of Anesthetized Cats." Journal of the Acoustical 
Society of America 41: 1237-1261. 

Haas, H. 1951. "Über den Einfluss eines Einfachechos auf die Hörsamkeit von Sprache." Acustica 1: 49-58. Trans. as 
"The Influence of a Single Echo on the Audibility of Speech." Journal of the Audio Engineering Society 20 (1972): 
146-159. 

Hall, Donald E. 1980. Musical Acoustics: An Introduction. Belmont, Calif.: Wadsworth. 

Handel, Stephen. 1989. Listening. Cambridge, Mass.: M IT Press. 

Harel, D. 1987. "Statecharts: A Visual Formalism for Complex Systems.” Science of Computer Programming 8: 
231-274. 

Haus, Goffredo, and A. Rodríguez. 1993. "Formal Music Representation, a Case Study: The Model of Ravel's Bolero by
Petri Nets." In Music Processing, 165-232. Ed. Goffredo Haus. Middleton, Wisc.: A-R Editions.

Haus, Goffredo, and Alberto Sametti. 1991. "ScoreSynth: A System for the Synthesis of Music Scores Based on Petri
Nets and a Music Algebra." IEEE Computer (July): 56-59.

Hayburn, Robert F. 1979. Papal Legislation on Sacred Music. Collegeville, Minn.: Liturgical Press.

Hayes, William. 1751. "The Art of Composing Music by a Method Entirely New, Suited to the Meanest Capacity." Cited
in R. K. Zaripov, "Cybernetics and Music," Perspectives of New Music 7 (1969, no. 2): 115-154, 120.

Hellegouarch, Yves. 2002. "A Mathematical Interpretation of Expressive Intonation." In Mathematics and Art:
Mathematical Visualization in Art and Education. Ed. Claude P. Bruter. New York: Springer-Verlag.

Helmholtz, Hermann. 1863. On the Sensations of Tone. 2d English ed., 1885. Trans. Alexander J. Ellis based on the 4th
German ed., 1877. New York: Dover, 1954.

Hild, Hermann, Johannes Feulner, and Wolfram Menzel. 1991. "HARMONET: A Neural Net for Harmonizing Chorales
in the Style of J. S. Bach." In Proceedings of Conference on Neural Information Processing Systems.

Hiller, Lejaren, and Leonard Isaacson. 1959. Experimental Music. New York: McGraw-Hill.

Holmes, Bob. 1997. "Requiem for the Soul." New Scientist, August 9.

Hoos, H. H., K. A. Hamel, K. Renz, and J. Kilian. 1998. "The GUIDO Music Notation Format: A Novel Approach for
Adequately Representing Score-Level Music." In Proceedings of the International Computer Music Conference, 451-454.

Houtsma, A. J. M., and J. L. Goldstein. 1972. "The Central Origin of the Pitch of Complex Tones: Evidence from Musical
Interval Recognition." Journal of the Acoustical Society of America 51: 520-529.





Huron, David. 1991. "Tonal Consonance versus Tonal Fusion in Polyphonic Sonorities." Music Perception 9 (2): 135-154.

Huxley, Aldous. 1928. Point Counter Point. New York: Harper.

ISO (International Organization for Standardization). 1987. Standard 226. http://www.iso.org/.

-. 1975. Standard 532A. http://www.iso.org/.

James, Jamie. 1993. The Music of the Spheres. New York: Grove Press.

Jung, C. G. 1962. Memories, Dreams, and Reflections. Ed. Aniela Jaffé. Trans. R. Winston and C. Winston. New York:
Random House.

Jungleib, Stanley. 1996. General MIDI. Middleton, Wisc.: A-R Editions.

Kachar, B., W. E. Brownell, R. Altschuler, and J. Fex. 1986. "Electrokinetic Changes of Cochlear Outer Hair Cells."
Nature 322: 365-368.

Kameoka, A., and M. Kuriyagawa. 1969a. "Consonance Theory, Part I: Consonance of Dyads." Journal of the
Acoustical Society of America 45 (6): 1451-1459.

-. 1969b. "Consonance Theory, Part II: Consonance of Complex Tones and Its Computation Method." Journal of
the Acoustical Society of America 45 (6): 1460-1469.

Kay, Michael. 1996. "Did Mozart Use the Golden Mean?" American Scientist, March/April.

Keislar, Douglas. 1988. "History and Principles of Microtonal Keyboard Design." Department of Music, Stanford
University, Stanford, California. STAN-M-45.

Kellner, Herbert A. 1979. "A Mathematical Approach Reconstituting J. S. Bach's Keyboard Temperament." Bach: The
Quarterly Journal of the Riemenschneider Bach Institute 10 (October): 2-8.

Knuth, Donald E. 1973. The Art of Computer Programming. Vol. 1: Fundamental Algorithms. Vol. 2: Seminumerical 
Algorithms. Reading, Mass.: Addison-Wesley. 

Köchel, Ludwig von. 1862. Works of Mozart. 6th ed. Ed. Franz Giegling, Alexander Weinmann, and Gerd Sievers. New
York: C. F. Peters Corp., 1964.

Koenig, Gottfried M. 1970. "Project 1." Electronic Music Reports 1 (2): 32. 

Kohonen, Teuvo. 1989. "A Self-Learning Musical Grammar, or Associative Memory of the Second Kind." In
Proceedings of the International Joint Conference on Neural Networks, Washington, D.C.

Koza, John R. 1992. Genetic Programming. Cambridge, Mass.: MIT Press.

Kruskal, J. B. 1964. "Multidimensional Scaling by Optimizing Goodness of Fit to a Nonmetric Hypothesis."
Psychometrika 29: 1-27.

Kuttner, Fritz A. 1975. "Prince Chu Tsai-Yu's Life and Work: A Reevaluation of His Contribution to Equal 
Temperament Theory.” Ethnomusicology 19 (2): 163-206. 

Landon, H. C. Robbins. 1976. Haydn: Chronicle and Works. Vol. 2. Bloomington: Indiana University Press.

Langer, S. 1953. Feeling and Form. New York: Philosophical Library. 

Lazer, A. C., and P. J. McKenna. 1990. "Large Amplitude Periodic Oscillations in Suspension Bridges: Some New
Connections with Nonlinear Analysis." SIAM Review 32 (4): 537-578.

Lentz, Donald. 1961. Tones and Intervals of Hindu Classical Music. Lincoln: University of Nebraska.

Licklider, J.C.R. 1951. "Basic Correlates of the Auditory Stimulus." In Handbook of Experimental Psychology. Ed. S. S.
Stevens. New York: Wiley.

Ligeti, Gyorgi. 1965. "Metamorphoses of Musical Form." Die Reihe 7: 5-19.

Lorenz, Edward. 1972. "Predictability: Does the Flap of a Butterfly's Wings in Brazil Set off a Tornado in Texas?" Paper
presented at American Association for the Advancement of Science conference, December.

Lowman, E. L. 1971. "Some Striking Proportions in the Music of Béla Bartók." Fibonacci Quarterly 9 (5): 527-528,
536-537.

Loy, Gareth. 1985. "Musicians Make a Standard: The MIDI Phenomenon." CMJ: Computer Music Journal 9 (4): 8-26.

Macran, H. S. 1902. The Harmonics of Aristoxenus. New York: Oxford University Press.

Majernick, V., and J. Kaluzny. 1979. "On the Auditory Uncertainty Relations." Acustica 43: 132.

Mandelbrot, Benoit B. 1977. The Fractal Geometry of Nature. New York: W. H. Freeman.



Mathews, Max V., and J. R. Pierce. 1980. "Harmony and Nonharmonic Partials." Journal of the Acoustical Society of
America 68: 1252-1257.

-. 1989. "The Bohlen-Pierce Scale." In Current Directions in Computer Music Research. Ed. M. V. Mathews and
J. R. Pierce. Cambridge, Mass.: MIT Press.

Mathews, Max V., L. A. Roberts, and J. R. Pierce. 1984. "Four New Scales Based on Nonsuccessive-Integer-Ratio
Chords." Journal of the Acoustical Society of America 75 (S10A).

Mathews, Max V., and L. Rossler. 1968. "Graphical Language for the Scores of Computer-Generated Sounds.” 
Perspectives of New Music 6 (2): 92. 

McClelland, J. L., D. Rumelhart, and G. E. Hinton. 1986. "The Appeal of Parallel Distributed Processing." In Parallel
Distributed Processing: Explorations in the Microstructure of Cognition. Ed. D. Rumelhart and J. L. McClelland.
Cambridge, Mass.: MIT Press.

McKenna, P. J. 1999. "Large Torsional Oscillations in Suspension Bridges Revisited: Fixing an Old Approximation."
American Mathematical Monthly 106 (January): 1.

Mersenne, Marin. 1635. Harmonie Universelle, Contenant la Theorie et la Practique de la Musique, ou il est Traite de
Consonances, des Dissonances, des Genres, des Modes, de la Composition, de la Voix, des Chants, et de Toutes Sortes
d'Instruments Harmoniques. Paris: Pierre Ballard. Facsimile ed. Francois Lesure. Paris: Editions du Centre National de
Recherche Scientifique, 1965. Trans. Roger E. Chapman as Harmonie Universelle: The Books on Instruments. The
Hague: Nijhoff, 1957.

Messiaen, Olivier. 1942. Technique de Mon Langage Musical. Paris: Leduc.

Meyer, Leonard B. 1956. Emotion and Meaning in Music. Chicago: University of Chicago Press.

Minsky, Marvin. 1974. "A Framework for Representing Knowledge." MIT Artificial Intelligence Laboratory Memo 306.
Also in The Psychology of Computer Vision. Ed. P. H. Winston. New York: McGraw-Hill, 1975.

Minsky, Marvin, and Seymour Papert. 1969. Perceptrons. Cambridge, Mass.: MIT Press.

Moore, Brian. 1997. An Introduction to the Psychology of Hearing. 4th ed. San Diego: Academic Press.

Moore, F. Richard. 1988. "The Dysfunctions of MIDI." CMJ: Computer Music Journal 12 (1): 19-28.

-. 1990. Elements of Computer Music. Upper Saddle River, N.J.: Prentice-Hall.

Morse, Marston. 1959. "Mathematics and the Arts." Bulletin of the Atomic Scientist, February.

Nettheim, Nigel. 1992. "On the Spectral Analysis of Melody." Interface: Journal of New Music Research 21: 135-148.

Neumann, John von. 1963. "Various Techniques Used in Connection with Random Digits." In Collected Works. Vol. 5,
768-770. Oxford: Pergamon.

Norden, H. 1964. "Proportions in Music." Fibonacci Quarterly 2 (3): 219-222.

O'Beirne, Thomas H. 1968. "940,364,969,152 Dice-Music Trios." Musical Times 109 (October): 911-913.

Ohm, G. S. 1843. "Über die Definition des Tones, nebst daran geknüpfter Theorie der Sirene und ähnlicher tonbildender
Vorrichtungen." Annals of Physical Chemistry 59: 513-565.

Olson, Harry F. 1952. Music, Physics, and Engineering. New York: Dover, 1967.

Oppenheim, D. 1996. “DMIX, A Multifaceted Environment for Composing and Performing." Computers and 
Mathematics with Applications 32 (1): 117-135. 

Pangle, Thomas L., ed. and trans. 1980. The Laws of Plato. New York: Basic Books.

Park, S. K., and K. W. Miller. 1988. "Random Number Generators: Good Ones Are Hard to Find." Communications of
the ACM 31 (10): 1192-1201.

Partch, Harry. 1947. Genesis of a Music. Madison: University of Wisconsin Press. New York: Da Capo Press, 1979.

Perle, George, and Paul Lansky. 1981. Serial Composition and Atonality. Los Angeles: University of California Press.

Petri, C. A. 1976. "General Net Theory." In Proceedings of IBM/University of Newcastle-upon-Tyne Seminar, 131-169.
Also in Technische Universität Berlin, GMD-Bericht No. 3. Munich: Oldenbourg Verlag, 1979.

Pierce, John R. 1983. The Science of Musical Sound. San Francisco: Scientific American Books.

Pingle, B. A. 1962. History of Indian Music. Calcutta: Gupta.

Pirsig, Robert M. 1974. Zen and the Art of Motorcycle Maintenance. New York: Morrow. 






Plomp, R. 1970. "Timbre as a Multidimensional Attribute of Complex Tones." In Frequency Analysis and Periodicity
Detection in Hearing. Ed. R. Plomp and G. Smoorenberg. Leiden: Sijthoff.

Plomp, R., and W. J. M. Levelt. 1965. "Tonal Consonance and Critical Bandwidth." Journal of the Acoustical Society of
America 38: 548-560.

Pope, Stephen. 1986. "Music Notations and the Representation of Musical Structure and Knowledge." Perspectives of
New Music 24: 156-189.

-, ed. 1991. The Well-Tempered Object: Musical Applications of Object-Oriented Software Technology.
Cambridge, Mass.: MIT Press.

Potter, Gary M. 1971. "The Role of Chance in Contemporary Music." Ph.D. diss., School of Music, Indiana University,
Bloomington.

Press, William H., Brian P. Flannery, Saul A. Teukolsky, and William T. Vetterling. 1988. Numerical Recipes in C: The
Art of Scientific Computing. Cambridge: Cambridge University Press.

Putz, J. F. 1995. "The Golden Section and the Piano Sonatas of Mozart." Mathematics Magazine 68 (4): 275-282.

Rameau, Jean-Philippe. 1722. Traite de l'Harmonie. Trans. Philip Gossett as Treatise on Harmony. New York: Dover,
1971.

-. 1737. "Generation Harmonique, ou Traite de Musique Theorique et Pratique: Proposition xii." Trans. D. Hayes
in "Rameau's Theory of Harmonic Generation." Ph.D. diss., Department of Music, Stanford University, Stanford,
California, 1968.

Ramos, Bartolomé. 1482. "Musica Practica." In Source Readings in Music History, 203-204. Ed. Oliver Strunk. New
York: W. W. Norton, 1998.

Révész, Geza. 1954. Introduction to the Psychology of Music. Norman: University of Oklahoma Press. New York: 
Dover, 2001. 

Roads, Curtis. 1984. "An Overview of Music Representations." In Musical Grammars and Computer Analysis, 7-37. Ed.
Mario Baroni and Laura Callegari. Firenze: Olschki.

Roberts, L. A., and Max V. Mathews. 1984. "Intonation Sensitivity for Traditional and Nontraditional Chords." Journal
of the Acoustical Society of America 75: 952-959.

Roederer, Juan. 1973. Introduction to the Physics and Psychophysics of Music. London: English Universities Press.

Rossing, Thomas D. 1983. The Science of Sound. Reading, Mass.: Addison-Wesley.

Rothstein, Edward. 1995. Emblems of Mind: The Inner Life of Music and Mathematics. New York: Times Books.

Ruiz, Pierre M. 1969. "A Technique for Simulating the Vibrations of Strings with a Digital Computer." Master's thesis,
Department of Music, University of Illinois, Urbana-Champaign.

Rumelhart, D. E., G. E. Hinton, and R. J. Williams. 1986. "Learning Internal Representations by Error Propagation." In
Parallel Distributed Processing: Explorations in the Microstructure of Cognition. Ed. D. Rumelhart and J. L.
McClelland. Cambridge, Mass.: MIT Press.

Sabine, W. C. 1921. Collected Papers on Acoustics. Los Altos, Calif.: Peninsula Publishing, 1993.

Samek, Miro. 2002. Practical Statecharts in C/C++: Quantum Programming for Embedded Systems. San Francisco: 
CMP Books. 

Santillana, Giorgio de, and Herta von Dechend. 1969. Hamlet's Mill. Boston: David R. Godine.

Scaletti, Carla. 1989. "Composing Sound Objects in Kyma." Perspectives of New Music 27: 42-69.

-. 1991. "The Kyma/Platypus Computer Music Workstation." In The Well-Tempered Object: Musical Applications
of Object-Oriented Programming. Ed. Stephen Pope. Cambridge, Mass.: MIT Press.

Schank, R. C., and R. Abelson. 1976. Scripts, Plans, Goals, and Understanding. Hillsdale, N.J.: Erlbaum.

Schenker, Heinrich. 1935. Der freie Satz: Neue Musikalische Theorien und Phantasien. Vienna: Universal.

Schillinger, Joseph. 1948. The Mathematical Basis of the Arts. New York: Da Capo Press, 1976.

Schouten, J. F., R. J. Ritsma, and B. Lopes Cardozo. 1962. "Pitch of the Residue." Journal of the Acoustical Society of
America 34: 1418-1424.

Schroeder, Manfred R. 1979. "Binaural Dissimilarity and Optimum Ceilings for Concert Halls: More Lateral Sound
Diffusion." Journal of the Acoustical Society of America 65: 958.



Seebeck, August. 1841. "Beobachtungen über einige Bedingungen der Entstehung von Tönen [Observations on Some
Conditions for the Creation of Tones]." Annals of Physical Chemistry 53: 417-436.

Shannon, Claude E. 1948. "A Mathematical Theory of Communication." Bell System Technical Journal 27 (July): 
379-423; 27 (October): 623-656. 

Shannon, Claude E., and Warren Weaver. 1949. The Mathematical Theory of Communication. Urbana: University of
Illinois Press. 

Sharp, D. B. 1996. "Acoustic Pulse Reflectometry for the Measurement of Musical Wind Instruments." Ph.D. diss.,
University of Edinburgh.

Shepard, R. N. 1964. "Circularity in Judgments of Relative Pitch." Journal of the Acoustical Society of America 36:
2346-2353.

-. 1982. "Structural Representations of Musical Pitch." In The Psychology of Music, 344-384. Ed. Diana Deutsch.
San Diego: Academic Press.

Slonimsky, Nicholas. 1948. Slonimsky's Book of Musical Anecdotes. New York: Allen, Towne, and Heath. New York:
Routledge, 2002.

Stevens, S. S. 1956. "Calculation of the Loudness of Complex Noise." Journal of the Acoustical Society of America 28:
807-832.

-. 1961. "Procedure for Calculating Loudness: Mark VI." Journal of the Acoustical Society of America 33 (11):
1577-1585.

-. 1962. "The Surprising Simplicity of Sensory Metrics." American Psychologist 17: 29-39.

Stevens, S. S., and E. B. Newman. 1936. "The Localization of Actual Sources of Sound." American Journal of 
Psychology 48: 297-306. 

Stroustrup, Bjarne. 1991. The C++ Programming Language. Reading, Mass.: Addison-Wesley.

Strunk, Oliver, ed. 1950. Source Readings in Music History. 6th rev. ed. New York: W. W. Norton, 1998.

Strutt, John W. 1907. "Our Perception of Sound Direction." Philosophical Magazine 13: 214-232.

Stuckenschmidt, H. H. 1969. Twentieth Century Music. Trans. Richard Deveson. New York: McGraw-Hill.

Stumpf, Carl. 1883/1890. Tonpsychologie [Tone Psychology]. Leipzig: Hirzel. 2 vols.

Terhardt, E. 1974. "Pitch, Consonance, and Harmony." Journal of the Acoustical Society of America 55: 1061-1069.

-. 1979. "Calculating Virtual Pitch." Hearing Research 1: 155-182.

Tiggelen, Philippe J. van. 1987. Componium: The Mechanical Musical Improvisor. Louvain-la-Neuve, France: Institut
Supérieure d'Archéologie et d'Histoire de l'Art.

Todd, Peter M. 1989. "A Connectionist Approach to Algorithmic Music." CMJ: Computer Music Journal 13 (4). Also in
Music and Connectionism. Ed. Peter Todd and Gareth Loy. Cambridge, Mass.: MIT Press.

Todd, Peter, and Gareth Loy, eds. 1989. Music and Connectionism. Cambridge, Mass.: MIT Press.

Todd, Peter M., and Gregory M. Werner. 1998. "Frankensteinian Methods for Evolutionary Music Composition." In
Musical Networks: Parallel Distributed Perception and Performance. Ed. Niall Griffith and Peter M. Todd. Cambridge, 
Mass.: MIT Press. 

Turing, Alan M. 1950. "Computing Machinery and Intelligence." Mind 59: 433-460.

Valk, R. 1978. "Self-Modifying Nets: A Natural Extension of Petri Nets." In ICALP 1978: Proceedings of the
International Conference on Automata, Languages and Programming, Fifth Colloquium, 464-476.

Vogt, Mauritius. 1719. Conclave Thesauri Magnae Artis Musicae. Cited in H. Kirchmeyer. "On the Historical
Construction of Rationalistic Music." Die Reihe 8 (1962): 11-29, 20.

Voss, Richard F., and J. Clarke. 1975. "1/f Noise in Music and Speech." Nature 258: 317-318.

-. 1978. "1/f Noise in Music: Music from 1/f Noise." Journal of the Acoustical Society of America 63 (1):
258-263.

Wallach, H., E. B. Newman, and M. R. Rosenzweig. 1949. "The Precedence Effect in Sound Localization." American
Journal of Psychology 52: 315-336.

Ware, J. A., and K. Aki. 1969. "Continuous and Discrete Inverse Scattering Problems in a Stratified Elastic Medium. I:
Planes at Normal Incidence." Journal of the Acoustical Society of America 45: 911-921.





Warren, R. M. 1970. "Elimination of Biases in Loudness Judgments for Tones." Journal of the Acoustical Society of
America 48: 1397.

Wiggins, G. A., M. Harris, and A. Smaill. 1989. "Representing Music for Analysis and Composition." In Proceedings of
the 2d International Joint Conference on Artificial Intelligence (IJCAI-89), Detroit, Workshop on Artificial Intelligence
and Music, 63-71.

Wightman, F. L. 1973. "The Pattern Transformation Model of Pitch." Journal of the Acoustical Society of America 54:
407-416.

Wilkinson, S. 1988. Tuning In: Microtonality in Electronic Music. Milwaukee: Hal Leonard Books.

Xenakis, Iannis. 1955. "The Crisis of Serial Music." Gravesaner Blätter.

-. 1971. Formalized Music. Bloomington: Indiana University Press.

Yasser, Joseph. 1932. Theory of Evolving Tonality. New York: American Library of Musicology. New York: Da Capo
Press, 1975.

Yost, William A. 2000. Fundamentals of Hearing: An Introduction. 4th ed. New York: Academic Press.

Young, Thomas. 1800. "Of the Temperament of Musical Intervals." Philosophical Transactions. Royal Society of
London.

Zwicker, E. 1961. "Subdivision of the Audible Frequency Range into Critical Bands (Frequenzgruppen)." Journal of the
Acoustical Society of America 33: 248.

Zwicker, E., and H. Fastl. 1990. Psychoacoustics: Facts and Models. Berlin: Springer-Verlag.

Zwicker, E., and R. Feldtkeller. 1955. "On the Derivation of Critical Bands from the Loudness of Complex Sounds."
Acustica 5: 40-45.

Zwicker, E., G. Flottorp, and S. S. Stevens. 1957. "Critical Bandwidth in Loudness Summation." Journal of the
Acoustical Society of America 29: 548-557.



Equation Index 


Acoustical Uncertainty (6.10), 183
Acoustic Mean Free Path (A.11), 416
Angular Acceleration (5.8), 132
Angular Displacement (5.1), 129
Angular Displacement with elapsed time (5.7), 132
Angular Frequency (8.5), 243
Angular Velocity (5.6), 131
Average Acceleration (4.12), 104
Average Acceleration (4.14), 104
Average Mass, Atom of Air (7.6), 204
Average Power (4.32), 114
Average Speed (4.8), 102
Average Surprisal (9.17), 346
Average Velocity (4.10), 102

Bar with Free Ends (8.19), 260
Bark Number (6.9), 182
Bel Scale, The (4.38), 120
Bohlen-Pierce Equal-Tempered Scale (3.23), 92
Boltzmann's Constant (7.8), 206
Brownian Noise (9.24), 355

Cantilever Beam (8.18), 259
Cent (3.7), 45
Cent Interval (3.8), 45
Centripetal Acceleration (5.12), 134
Cosecant (A.9), 413
Cosine Relation (A.4), 412
Cotangent (A.7), 413
Critical Bandwidth (6.8), 180

dB SIL (5.31), 147
dB SPL (5.32), 147
Decibel, The (4.40), 120
Decibel-to-Intensity Conversion (4.41), 121
Decibel Scale, The (4.39), 120


Diffraction (7.22), 227
Dimension (9.22), 353
Displacement (4.7), 101
Doppler Shift (7.23), 229
Doppler Shift, Both Move (7.25), 231
Doppler Shift, Receiver Moves (7.24), 231
Doppler Shift in Two Dimensions (7.26), 232
Driving Force (8.25), 270
Drum Head Mode Frequencies (8.23), 268
Drum Vibration (8.24), 268

Elapsed Time (4.9), 102
Equal-Tempered Intervals (3.1), 40
Equation of Motion (8.11), 249
Exponential Attack (8.31), 281
Exponential Decay (8.29), 281

First Backward Difference (4.6), 101
First Fret (3.16), 83
Force of Gravity (4.25), 109
Frequency (4.1), 99
Frequency Modes of a Pipe Closed at One End (8.21), 264
Frequency Modes of a Pipe Open at Both Ends (8.20), 264
Frequency Related to Angular Velocity (5.17), 136

Gravitational Potential Energy (4.30), 112
Gravitational Work (4.29), 112

Heat Capacity Ratio (7.4), 203
Helmholtz Resonator (8.8), 245
Hooke's Law (8.1), 240


Ideal Gas Law (7.7), 205
Ideal Gas Law using Boltzmann's Constant (7.9), 206
Information (Entropy) (9.19), 346
Instantaneous Acceleration (4.15), 104
Instantaneous Velocity (4.11), 103
Intensity Range of Hearing (4.37), 119
Interaural Time Difference (ITD) (6.11), 189
Interval (2.2), 14
Inverse Cent (3.9), 46
Inversion (9.8), 315

Just Noticeable Difference (JND) (6.1), 160

Kinetic Energy (4.28), 111

Law of Reflection (7.18), 210
Length of an Arc (5.5), 131
Linear Interpolation (9.10), 324
Logistic Function (9.29), 378
Longitudinal Bar (8.16), 257

Mass Density (7.5), 204
Maximum Velocity of Simple Harmonic Motion (5.25), 142
Middle C (3.5), 41
Mode Length (8.13), 256

Newton's Second Law of Motion (4.24), 108

Octaves (2.1), 14

Partitioning (9.4), 310
Peak Pressure Level (5.28), 144
Peak-to-Peak Pressure Level (5.29), 144
Pendulum Frequency (8.6), 243






Period (4.2), 99
Period Related to Angular Velocity (5.15), 136
Phon/Sone Conversion (6.6), 170
Piston Frequency (8.7), 244
Pressure (4.33), 118
Probability and Surprisal (9.14), 345

Quality Factor (8.26), 277

Radian (5.4), 131
Radian Velocity (5.16), 136
Rayleigh Distance (7.17), 209
Redundancy (9.21), 347
Reflection (7.19), 215
Refraction (7.21), 218
Relation of Tangent, Sine, and Cosine (A.6), 413
Rotational Energy (5.27), 144


Rotational Speed (5.9), 132
Rotational Velocity (5.26), 143

Sabine's Equation for Reverberation Time (7.31), 237
Secant (A.8), 413
Second Fret (3.17), 83
Second-Order Central Difference Approximation (4.16), 106
Semitone Interval (3.2), 40
Sine Relation (5.18), 137
Sine Relation (A.3), 412
Sones and Intensity (6.7), 170
Specific Heat Capacity (7.3), 203
Speed of Sound (7.14), 207
Speed of Sound (7.2), 202
Speed of Sound at STP (7.15), 207
Stevens' Law (6.3), 160
Stretched Membrane (8.22), 267
String Mode Frequency (8.15), 256
Surprisal (9.16), 346


Taking n Unordered Objects r at a Time (9.6), 311
Tangent Relation (A.5), 412
Tangential Speed (5.14), 136
Thermodynamic Probability (Entropy) (9.20), 346
Total Mechanical Energy (4.31), 112
Transmission (7.20), 216
Transposition (9.7), 314

Uncertainty (9.18), 346
Unit Interpolation (9.9), 323
Universal Wave Equation (7.16), 207

Vibrating Frequency (8.4), 242

Weber-Fechner Law (6.2), 160
Weierstrass Function (9.23), 355
Work (4.26), 110



Subject Index 


1/f spectral tendency, 354, 359
12-tone composition. See methodology
12-tone row. See tone rows
2AFC. See two-alternative forced-choice

A440, 12, 14, 40-42, 49, 99, 444
absolute refractory period, 158
absorption, 199, 221-222, 236, 416
  of air, 237
  total, 222
absorption coefficient, 222, 418
acceleration, 5, 99, 104-109, 241, 248-249, 274-275
  angular, 132
  as bending, 104
  centripetal, 134
  instantaneous, 104, 106
accidentals, 21
acoustical shadow, 208
acoustic pulse reflectometry, 212
acoustic reflex, 152
acoustics, 150
  architectural, 233, 235, 416
  Ohm's law of, 157
adiabatic, 200
ADSR. See attack, sustain, decay, release
Aeolian mode, 20
aerophones, 251
AI. See artificial intelligence
air
  elastic properties, 201-202
  inertial properties, 201-202, 204
air column, 263
Airy disc, 225-226
algorism, 288
algorithm, 288, 290
amanuensis, 293
American National Standards Institute, 157


amplitude, 34, 117, 139-147
  maximum, 274
anatomical transfer function, 190
anechoic chamber, 194, 221
angle
  critical, 219
  of incidence, 210, 218-219
angle of incidence, 218
ANSI. See American National Standards Institute
antinodes, 254, 263
antiresonance, 36
apotome, 53
area, 98, 122, 205, 209, 222, 236
  density, 100
  surface, 222, 235
arguments, 42, 426
  actual, 431
  formal, 431
arithmetic mean, 48
art, 289
Artificial Intelligence, 372
artificial neural networks, 376-378, 388-389
ASCII character code, 438, 450
associativity, 425
asymptote, 280
ATF. See anatomical transfer function
atm. See atmosphere
atmosphere, 205
attack, 35
attack transients, 183, 198
attack, sustain, decay, release, 36
auditory canal, 151
auditory nerve, 153
auditory scene analysis, 150
aural sensibility, 403-406
average molecular mass, 204
Avogadro's number, 204
axis of rotation, 129
azimuth, 188


Babbitt, Milton, 331
Bach, C. P. E., 296
Bach, J. S., 70, 184, 326, 363, 388
back propagation of error, 378, 380
backtracking, 362
band, 36
band center, 36
bandwidth, 36, 277
  of human hearing, 36
bark scale, 181
bar lines, 26
barometer, 124
bars
  with free ends, 260
  longitudinal, 256
  transverse, 258, 261-262
Bartók, Béla, 349
basilar membrane, 6, 153-154, 158, 178, 184, 239
beat frequency, 173, 185
beats (acoustical), 51, 53, 173-178
  first-order, 174
  second-order, 174
beats (musical), 26-27, 447-448
bel, 120
Bell, Alexander Graham, 120
bells, 261
Benedetti, Giovanni Battista, 57
Berg, Alban, 311
Bessel functions, 267
Bielawa, Herbert, 372
binary infix, 423
bit, 345, 347
Bohlen, Heinz, 87
Bolero, 175
Boltzmann's constant, 206, 346
bore conical, 265
bore cylindrical, 265
Boulanger, Richard, 93
Boulez, Pierre, 293, 331
broadband, 36
Brown, Earle, 293





Brown, Robert, 356
Brownian
  motion, 356
  noise, 355-357
  number generator, 356
Brün, Herbert, 295

cacophony, 306
Cage, John, 293, 298-299, 350
Calder, Alexander, 293
cantilever beams, 259
cardinal points, 139
cardinality, 317, 462
cartesian coordinates, 97, 352
causal, 299, 372, 390
causality, 371-372
cent scale. See scales
central processing theories, 158
chaotic, 304
character set, 438
chimes, 261
Chomsky, Noam, 401
Chopin, Frédéric, 401
chordophones, 251
chroma, 15, 163-165, 313
chromatic scale. See scales
Chu Tsai-yu, 70
cilia, 153
circle of fifths, 23
circular harmonic motion, 243
circular motion, 129
clang tone, 261
clarinet, 264
clef, 12
CMN. See common music notation
cochlea, 153
cognition, 376
combination tones, 175
combinatorics, 306, 311
comma
  of Didymus, 51
  Pythagorean, 54, 67-70, 80
  Syntonic, 51, 55
common music notation, 12
common time, 27
compass interval, 87
complex tones, 157-158, 161
complexity theory, 305-306
compliance, 240, 244
components, 29
Componium, 297
composable function, 317
composition, 317
compound statement, 428
compressibility, 244
conditional probability, 367
cone of confusion, 190


confounding factors, 402
congruence, 301, 414
connectionism, 377
conservative forces, 275
consonance, 56-60, 87, 92-93, 178, 184-186, 380-383, 385, 418
  perfect, 185
  of perfect intervals, 60
constant Q, 182
continued fractions, 85
continuous distribution. See distribution
contralateral, 187
convolution, 335
coordinate system, 97, 352
Cope, David, 400
creep wave, 219
crescendo, 32
critical angle, 219
critical bands, 176, 178-182
crumhorn, 263
cybernetics, 360
cycle, 319

damping, 8, 276
d'Arezzo, Guido, 285, 407
dB. See decibel
dB SIL, 146
dB SPL, 147
DEC. See dynamically expanding context
decay, 36, 279
decibel, 120
deconstructionism, 350
decrescendo, 32
Deep Blue, 403
degenerate, 371
degrees
  angular, 130-131, 411
  of the diatonic scale, 17
  of freedom, 187, 249-250, 254, 278, 282
  interval size, 439
  per octave, 439
  of a scale, 16
density, 201
  area, 100
  cubic, 100
  linear, 100
de Rore, Cipriano, 57
design, 406
deterministic, 290, 304
diabolus in musica, 53
diatonic scale. See scales
dichotic, 158
difference frequency, 173
difference tones, 175


diffraction, 222, 227-228
  Fraunhofer, 225
  Fresnel, 225
  pattern, 223
diffuse, 416
dimension, 352
directed graph, 367
  acyclic, 368
  cyclic, 368
directrix, 217
direct signal, 233
dispersion, 211
dispersive effect, 218
displacement, 100, 106, 239
  angular, 129, 131
  antinodes, 263-264
  nodes, 263-264
dissipation-limited, 275
dissonance, 56, 93, 184-186, 380
distortion, 121-122, 179, 239
distribution
  continuous, 334
  discrete, 334
  probability, 333
  uniform, 299, 333
distribution function
  cumulative, 339
  probability, 336
Dodge, Charles, 299
dominant, 17
Doppler shift, 228-232
driven harmonic oscillators. See oscillators
drums, 266, 270
duplex theory, 190
duration, 26
dynamical, 280
dynamical system. See systems
dynamically expanding context, 373-375, 379, 403
dynamic range, 27
dynamic spectrum, 33
dynamics, 304

eardrum, 151
early reflections, 233
Ebcioglu, Kemal, 363
echoes, 211
  flutter, 235
  late, 234
  slap-back, 235
efficiency, 115
Einstein, Albert, 356
elasticity, 239-241
EMI. See Experiments in Musical Intelligence
end correction, 245, 264





endolymph, 153
energy
  elastic potential, 112
  gravitational potential, 112
  internal, 200, 202
  kinetic, 111
  macroscopic, 200
  microscopic, 200
  potential, 111
  tensile potential, 112
  total mechanical, 112
energy distribution, 30, 32
  spectral, 195-196
  temporal, 196
enharmonic equivalents, 21
entropy, 345-349, 354
  maximum, 347
enumeration, 296, 307, 310
envelope, 2
  amplitude, 34-35
  spectral, 34, 195
equal loudness contours, 167-168
equal temperament, 39
equal-tempered scale. See scales
equilibrium, 239, 248
  dynamic, 248
  static, 239, 241, 248-249
esraj, 266
Euclid's method, 288, 430
Eurythmics, 407
expectation, 347-348, 350
experimental method, 402
Experiments in Musical Intelligence, 400
expressions, 422

far field, 125, 209
fBm. See fractional Brownian motion
feedback, 378
Fibonacci sequence, 436
fife, 263
final, 20
first backward difference, 101
five-limit, 60
flat, 21-22
floor function, 303
flute, 263-264
focusing effect, 218
Fogliano, Lodovico, 60
force, 108, 248
  conservative, 113
  elastic, 247
  inertial, 247
  kinetic frictional, 110
  nonconservative, 113
  sliding frictional, 110
  static frictional, 109


forced motion, 270
forces
  contact, 109
  external, 112
  internal, 112
  noncontact, 109
formants, 36
Foster, Stephen, 364
Fourier transform, 226
fractals, 350, 352-360
  deterministic, 353
  random, 353
fractional Brownian motion, 357
fractional dimensions, 353
Fraunhofer diffraction, 227
Fraunhofer region, 209
free motion, 270
frequency, 13, 99, 136, 141-142
  radian, 243
frequency resolution, 162
Fresnel zone, 209
frets, 253
fundamental, 29
fundamental frequency, 29, 37

Gabor, Dennis, 333
Galilei, Vincenzo, 68, 82
gamut, 16
GCD. See greatest common divisor
genetic programming, 389
geometric mean, 64
glissando, 32, 253
glockenspiel, 261
golden mean, 349-350, 437
goodness, 406
goodness-of-fit metric, 71-92
gravity, 100-113, 242, 248
greatest common divisor, 288
Guido's method, 291, 407
Guidonian hand, 286

hair cells, 153
halving time, 280
HARMONET, 388
harmonic mean, 48
harmonic oscillators. See oscillators
harmonic proportion, 48
harmonic series. See series
harmonics, 29-37, 47-48, 158, 263-265, 282, 355
harmony, 12, 14, 29, 60, 361, 407
  functional, 86, 331
  of the spheres, 47
harmony theory, 54, 164
hartley, 347
Haydn, Joseph, 296, 406


head-related transí er function, 190 

head room, 121 

heat, 202 

heat capacity, 202 

heat capacity ratio, 201 

Heisenberg's uncertainty principle, 32

helicotrema, 153

Helmholtz resonator, 244-247 

hertz, 99 

histogram, 316, 364 
homogeneous, 416 
Hooke's law, 241, 272-273
HRTF. See head-related transfer 
function 
hum note, 157 
Hurst exponent, 355
Huygen's principle, 223
Hz. See hertz

I Ching, 298
ideal gas, 200
ideal gas law, 205
idiophones, 251 

ILD. See interaural level difference 

Illiac Suite, 360, 364, 402

impulse response, 233 

incus, 152 

index operator, 426

inertia, 99-100, 248, 273-274, 356 

inertia-limited, 275 

inertial reactance, 249 

information, 343, 346 

information theory, 343-350 

inharmonic, 29-30 

integer, 14 

intensity, 118 

interaural level difference, 188, 190

interaural time difference, 188, 190
interference 
constructive, 210, 224 
destructive, 210, 225 
pulse, 326 
interpolation 
linear, 324 
unit, 323 
interval 
affinity, 164 
class vector, 317 
equivalence, 14 
identity, 14 
individuality, 15 
order, 17 
intervals, 14 
augmented, 18-19 
augmented fourth, 19 
cent, 45-46 



Subject Index 


intervals (cont.) 
diminished, 18-19 
diminished fifth, 19 
feeling, 18 
fifth, 22 
fourth, 19 
half step, 17 
imperfect, 43 
inversion, 44
just, 43, 57 
major/minor, 19 
perfect, 18-19, 43 
Pythagorean chromatic semitone, 53

Pythagorean diatonic semitone, 53 

second, 99 

semitone, 17, 42

sonority, 19 

sruti, 77 

tempered semitone, 40 
tritone, 19, 383 
unison, 19

intonation sensitivity, 92 
inversion, 315
Ionian mode, 20
ipsilateral, 187
ISO-10646, 438 
isotropic, 416 

ITD. See interaural time difference
Ives, Charles, 73 

jaw harp, 263 

JND. See just noticeable difference
Joplin, Scott, 401 
joule, 111, 202 

just noticeable difference, 159 
just noticeable loudness 
threshold, 177 

key, 22 

key signature, 22 

Kirnberger, Johann Philip, 69, 295
Koch snowflake, 353 
Koenig, Gottfried M., 332 

lacunarity, 355 
laminar flow, 221 
land speed of sound, 220 
lateral onset cue, 189
law of inertia, 108
leading tone, 24
learning, 372-389 
left-hand side, 424 
legato, 36 

Ligeti, György, 332
limit, 103 

limit of hearing, 119


limma, 53 

linear congruential method, 301 
local minimum, 382 
logistic function, 378 
lossy, 179 
loudness, 119,167 
loudness JND, 167
lowest common denominator, 15
Lydian mode, 20 

magnitude, 135 
major scale. See scales
malleus, 152

Malzel's metronome, 27, 447
Mandelbrot, Benoit, 353
marimba, 263 
marking, 398 
Markov chains, 363-371
masker, 171 
masking, 171 
backward, 172 
forward, 171 
frequency, 171 
simultaneous, 172 
temporal, 171 
mass, 99, 249 
mass density, 204
matter, 99 
maxima, 144, 254 
mean free path, 356 
mean value, 145-146
measures, 26 
meatus, 151 
mediant, 17 
melody, 12, 14, 29
membranes, 266 
membranophones, 251 
Mersenne, Marin, 58 
Messiaen, Olivier, 331 
meter, kilogram, second, 97 
methodology, 285, 288, 290 
12-tone, 86, 312, 332 
compositional, 311, 350, 402 
deterministic, 290 
experimental, 155 
nondeterministic, 289 
metronome, 27 
metronome mark, 26 
Micrologus, 286
microphones, 124 
microtonality, 72-82 
microtonal scales. See scales
microtones, 72
middle C, 41
Pythagorean, 49 
MIDI. See Musical Instrument 
Digital Interface 


millisecond, 99 

minima, 144 

minor scale. See scales

missing fundamental, 157-158 

Mixolydian mode, 20

MKS. See meter, kilogram, second

MM. See Malzel's metronome

modes (scales), 20 

modulation, 54 

modulo arithmetic, 301, 414 

mol. See mole 

mole, 204 

monotony, 306 

Monte Carlo methods, 360-362
motion generator, 271 
Mozart, W. A., 296, 375, 401, 
403-406 

MP3, 170, 179-180, 344
MPEG, 170, 179
Musamaton, 326
music, 407 
atonal, 86, 312 
automated composition, 297 
experimental, 402 
programming, 292 
representation, 292, 327 
Musical Instrument Digital Interface, 292
musical score, 12 
musical style, 350, 363, 400-406 
music dictation, 343 
music engineering, 13, 47, 63, 295 
music notation. See common music 
notation 

music technology, 39 
Musikalische Würfelspiel, 295-298, 400

MUSIMAT, 285, 290-292, 309, 317, 324, 415, 421-451

Abs(), 437
Accidental(), 440
accumulate(), 340
Atan(), 437
brownian(), 356
Ceiling(), 434
Character, 438
cycle(), 320-321
dB(), 449
Do-While, 429
Else, 428
factorial(), 435
Fi(), 436
Floor(), 434
For, 429-430
Fr(), 436
getIndex(), 340
Halt(), 427








if, 428
Integer, 422, 434
IntegerList, 423
invert(), 318
key(), 440
linearInterpolate(), 324
log10(), 427
mm(), 448
Mod, 427
Mod(), 415
normalize(), 339
Octave(), 440
palindrome(), 321
permute(), 322
Pitch, 440
PitchClass(), 440
PitchList, 442
pitchToHz(), 441
PosMod(), 415
Pow(), 427
Print(), 427
Random(), 304, 337, 429, 437
randomRow(), 328-329
randTendency(), 330
Real, 422, 434
RealList, 338, 423
realRhythm(), 447
realToRational(), 446
Reference, 446
Repeat, 429
retrograde(), 318
Return(), 431
RhythmList, 447
setComplex(), 319
SetTempo(), 448
shuffle(), 329
Sqrt(), 437
stretch(), 325
String, 438
transpose(), 318, 322
VossFracRand(), 358
mutation stops, 175 
Myhill, John, 327

nanosecond, 99 
narrowband, 36 
nat, 347 
natural, 21 
natural modes, 250 
nazard, 175 
near field, 125, 209
nested functions, 431 
neural networks, 376, 378, 403
Newton's first law of motion, 108, 272-273, 356

Newton's second law of motion, 
108, 248 


Newton's third law of motion, 108 
nodes, 254 
noise, 157 

nonsustaining instruments, 115
normal, 109
force, 109
form, 313
normal modes, 250 
note, 12 
symbols, 12 
numero senario, 60 
nut, 82 

objective composition, 286 
octave, 14, 16

octave equivalence, 14, 16, 87
Ohm's law of acoustics. See 
acoustics 
onset, 26 
onset time, 26
operands, 423 
Oracle, 290 
orchestrion, 297 
organ of Corti, 153 
organum, 286 
origin, 100 

orthogonal, 97-98, 316, 352, 412 

oscillation, 8 

oscillators 

driven harmonic, 270-271
harmonic, 247, 273, 277-278 
ossicles, 152 
oval window, 153, 217 
overtones, 29 
overtone series. See series 

palindrome, 321 
parallel, 244 

parallel distributed processing, 377
Parmenides, 414 
Partch, Harry, 47, 60, 74-75 
partials, 29-37, 47, 157-158, 240
partitioning, 310 
Pascal, 118 

pattern completion, 375 
PCM. See pulse-code modulation 
PDP. See parallel distributed processing
peak pressure, 144 
peak pressure level, 144 
peak-to-peak pressure level, 144 
pendulum, 243 
pennywhistle, 263 
pentatonic scale. See scales
perilymph, 153 
period, 98, 136
periodicity, 99, 141


periodicity theory, 158 
peripheral theories, 158
permanent threshold shift, 152
permutation, 307, 322
circular, 308, 313 
Petri nets, 390-400 
phase, 140 
of matter, 202 
phase angle, 141 
phase offset, 141 
phase reversal, 213 
phase shift, 141 
phon, 167 
phon scale, 167 
Phrygian mode, 20 
piano, 262 
Pierce, John, 87, 371
pinna, 151 
pipe organ, 263, 355 
pipes 

closed one end, 264 
open both ends, 263 
piston, 244
pitch, 13, 439 

pitch classes, 16,164, 312-336 
pitch difference limen, 160 
pitch JND, 159-163
pitch space, 164 
pizzicato, 254 
place theory, 154 
point of equilibrium, 4, 248
points of inflection, 254 
political economy, 86 
polynomial, 300 
cyclic, 300 
expansion, 300
polyphony, 54 
portamento, 253 
power, 114 
precedence, 424 
precedence effect, 193 
precession, 57 
precomposition, 312 
predicate, 428

predicate/transition nets, 398 
pressure, 118, 205 
pressure waves. See waves 
prime form, 316 
prime numbers, 58 
principal value, 413
probability, 333-343 
probability distribution. See distribution
proportion, 407 
proportional analysis, 349 
proximity effect, 125 
PrT. See predicate/transition nets 











pseudorandom, 300 
psychoacoustics, 150, 154
psychophysics, 155 
Ptolemy, Claudius, 54 
pulse interference. See interference 
pulse-code modulation, 179 
P-waves, 207 
Pythagoras, 47-48 
Pythagorean comma. See comma 
Pythagoreans, 47-48, 406-407

quadrivium, 407 
quality factor, 181, 277 

rad. See radians 
radians, 130-131 
radiation pattern, 208 
radius of gyration, 260
Ramos, Bartolomé, 56 
random numbers, 300-301, 303, 337, 361-362, 429
random variable. See variable 
random walk, 355 
rational approximation, 81, 446
Ravel, Maurice, 175
Rayleigh distance, 209 
reactance, 249 
real, 14 

real numbers, 14 
recorder (the instrument), 263 
recurrence relation, 301 
recursion, 84, 435
recursive, 280, 355 
redundancy, 343, 347-354, 397 
reflection, 210-218 
diffuse, 211 
specular, 210 
refraction, 218-221 
relative major, 23 
relative minor, 23
release, 36, 279 
remaindering, 415 
resonance, 36, 245, 270 
resonant frequency, 274
response amplitude, 273
response pattern, 156 
resting length, 273 
restoring force, 239
rests, 26 

retrograde, 315, 318, 321 
reverberation, 211 
tail, 234 
time, 281, 417
revolution 
aesthetic, 87 
of a circle, 130-131
scientific, 98 


Rhythmicon, 326 
right-hand side, 424
ringing, 279 

RMS. See root mean squared
RMS amplitude, 146 
root mean squared, 146 
rotation, 308 
roughness, 195 
round window, 153 
rubato, 26 
rule of 18, 82 

sabine, 236 
Sabine, Wallace, 236 
sample space, 333
sarod, 266 
scala media, 153 
scala tympani, 153
scala vestibuli, 153
scales 
19-tone, 73 
53-tone, 73 
Bohlen-Pierce, 444 
Bohlen-Pierce chromatic, 91-92 
Bohlen-Pierce equal-tempered, 92 
Bohlen-Pierce just diatonic, 89 
cent, 45, 74-75, 459
chromatic, 20-21, 23, 25, 45-46, 
166, 312, 337, 341, 359, 380, 
439, 441 

diatonic, 17, 20, 22
dodecaphonic, 46, 307-308 
equal-tempered, 39-42, 45-46, 70, 77

harmonic minor, 23 
heptatonic, 46 
Hungarian minor, 25 
just, 43 

just pentatonic, 44
major, 18 

mean-tone tempered, 63 
melodic minor, 24 
minor, 18 

natural chromatic, 54-56, 71-72, 
74, 79, 443 
natural minor, 24 
Partch 43-tone, 76, 444
pentatonic, 23, 46 
Pythagorean chromatic, 442 
Pythagorean diatonic, 49 
Pythagorean dodecaphonic, 52, 
54-55, 79 

quarter-tone, 73, 444 
sruti, 77 

Syntonic diatonic, 55 
whole-tone, 25, 317 
scattering, 199, 211 


Schenker, Heinrich, 401 
Schillinger, Joseph, 325 
schisma, 81 

Schoenberg, Arnold, 86, 306, 311-319, 331, 350
scope 
global, 432 
local, 432 
search 

comparative, 363 
constrained, 363 

Second Viennese School, 86, 311 
self-similarity, 351, 353 
semitone. See intervals 
sensitivity to initial conditions, 305 
serialism, 331-333 
series, 244, 312, 332, 410 
arithmetic, 410 
finite, 410 
Fourier, 333 
geometric, 410 

harmonic, 37, 43, 50, 54-55, 62 
infinite, 410 
octave, 37 

overtone, 47-48, 51, 55, 60, 67 
set, 306, 312-332 
aggregate, 317 
set class, 314-317
complement, 317
set complex, 318-319
shadow, acoustical, 208 
sharp, 21-22
sharpness, 195 
Shepard tone illusion, 165
SI. See Système International d'Unités

sigma notation, 410 
signal, 199 

signal to noise ratio, 200 
signum function, 378 
simple harmonic 
motion, 4-7, 240 
sine relation, 137, 412 
sinusoid, 7 
sinusoidal, 7 
solmization syllables, 17 
sone, 167
sone scale, 170
sonogram, 33 
sonorities, 18 
sound intensity level, 120 
sound localization, 187-194 
sound pressure level, 117, 123
sound quality, 28 
specific heat 
capacity, 203 
spectral tendency, 352





spectrum, 30 
harmonic, 31
inharmonic, 31 
speed, 101 
instantaneous, 103 
rotational, 132 
tangential, 135 
speed of sound, 202, 207
SPL. See sound pressure level 
spreading, 199 

spring constant, 240, 244, 249 
sruti, 77 
staff, 12 

standard atmospheric pressure, 117 
standard temperature and pressure, 205
standing waves, 255 
stapedius, 152 
stapes, 152 
statement, 422 
static spectrum, 32, 34 
steady state, 279
stiffness, 240, 248, 257 
stiffness-limited, 274
Stockhausen, Karlheinz, 293 
STP. See standard temperature and 
pressure 

Stravinsky, Igor, 405 
strike note, 157 
strings 
ideal, 254 
stiffness, 262 
tension, 262
style. See musical style 
subdominant, 17 
submediant, 17 
subtonic, 17 
sum tones, 175 
summation, 410 
superdominant, 17 
superparticular ratios, 48 
superposition, 210, 251 
supertonic, 17 
supervised learning, 379 
surface area, 117-118, 235
surprisal, 345 
surprise, 350 
sustain, 36

sustaining instruments, 115
synchronicity, 298 
Syntonic comma. See comma 
Système International d'Unités, 97
systems, 149 
adiabatic, 200 
analysis/synthesis, 400 
auditory, 150 

automated composition, 401 
belief, 350 


causal, 372 
chaotic, 304 
complex, 306, 361, 398 
composing, 402 
deterministic, 304 
discrete dynamical, 397, 400
dynamical, 248, 304-305 
expert, 388 
nonlinear, 176 
open, 77 

random, 304, 337 
resonant, 36 
rule, 373 
scale, 17, 72-73
signaling, 149, 343, 345
spring/mass, 5, 136, 243, 248, 271-272

statistical composing, 333 
tuning, 69, 75 
vibrating, 8, 29-30, 270 

T60 time, 281 
tangent relation, 412
taste, 350, 406 
tectorial membrane, 153
tempered tuning, 20, 54, 68-72 
tempering, 63-64, 68-70, 73, 75 
equal, 70 
irregular, 69 
well, 69 
tempo, 26, 447 

temporary threshold shift, 152 
tension, 110
tensor tympani, 152 
thematicism, 331 
Theremin, Leon, 326
thermodynamic probability, 345
threshold of hearing, 119 
tierce, 175 
timbre, 28, 195-198
time, 106 

time constant, 281, 417
time signature, 27 
tonal fusion, 174
tonal harmony, 86-87, 312, 350 
See also harmony 
tonal palette, 69 
tone, 11-12 
tone height, 163 
tone rows, 308-313, 319-332
tonic, 17, 24 

tonotopic dissonance, 185 
tonotopic mapping, 154 
tonverschmelzung, 174 
total absorption, 222 
totally organized music, 331 
transformer, 217


transients, 278, 280
transition table, 365 
transpose, 22, 44
transposition, 314 
tremolo, 32, 173, 254
tritave, 87 

tritone. See intervals 
TTS. See temporary threshold shift 
Turing test, 403-405 
two-alternative forced-choice, 161
two-component theory 
of tone, 163 
tympani, 267 
tympanum, 151, 217 

unary prefix, 423 
uncertainty, 31, 343, 346 
acoustical, 183-184 
measurement, 305 
Unicode, 438 

uniform circular motion, 129 
uniform distribution. See 
distribution 
unison, 14, 16, 185
unit circle, 139
unit distance, 323
universal gas constant, 205 

variable, 421-422 
actual, 309 
continuous, 335 
control, 430 
global, 432 
independent, 15 
initialization, 430 
input, 430 
local, 432 
physical, 155 
psychoacoustic, 155 
random, 334, 336, 358 
reference, 321 
velocity, 5, 102, 106
angular, 131, 136, 243
instantaneous, 102 
radian, 136 
rotational, 203 
of simple harmonic motion, 142
tangential, 135, 142
translational, 200, 203 
vibrational, 203 
vibraphone, 261, 263 
vibration, 4-8 
vibration modes, 30 
vibrato, 28, 32 
Virtual Mozart, 404 
volume, 98, 235-236 





wavelength, 141-142 
wave motion 
longitudinal, 116 
torsional, 116 
transverse, 116 
waves, 3 
compression, 201, 207
crests, 140
cycle, 140
expansion, 201
incident, 215 
longitudinal, 207 


period, 140 
plane, 209
pressure, 207 
rarefaction, 207 
reflected, 215 
transmitted, 215 
troughs, 140 
zero-crossings, 140, 254

Weber-Fechner law, 160 
Webern, Anton, 311, 331
Weierstrass function, 355
weight, 100, 109


well tempered, 69 
whole step, 17 
whole tone, 17
wolf fifth, 53 
work, 110 

Xenakis, Iannis, 332
Xeno's paradox, 414 
xylophone, 263 

Young's modulus, 257-258, 260 
Zarlino, Gioseffo, 60, 333 



